Seven single-copy DNA probes were isolated that span
110 kilobases of the murine I region and used in a
restriction fragment length polymorphism (RFLP) analysis
with five restriction endonucleases on genomic DNAs from
28 H-2 homozygous wild and laboratory inbred mice.
Polymorphic restriction sites were used to cluster alleles
at each locus to form lineages representing groups of
minor variants of a single progenitor allele. These
lineages were then used to identify recombinational events
between the loci probed. Three recombinational hotspots
(RHS) were identified from the 26 unique I region
haplotypes analyzed. These RHSs are located 1) in the
second intron of Eβ, 2) at the centromeric end of Eα, and
3) approximately 5 kb telomeric of the Aα gene. The Eβ and
Eα RHSs correspond to those already documented, while the
RHS adjacent to Aα has not been previously defined. This
RHS maps to a 4.7 kb stretch of DNA 3' of the Aα gene and
its activity appears to be haplotype dependent. The three
RHSs  separate  the  I  region  into  four  genomic  segments 
where  the  sequences  within  a  particular  segment  accumulate 
mutations  at  the  same  rate.   These  segments  were  termed 
recombinationally  depressed  segments  (REDS)  since 
recombination  is  localized  to  the  RHSs  with  only  a  few 
rare  recombinational  events  occurring  within  a  defined 
REDS.   These  REDS  were  grouped  into  lineages  which 
represent  a  limited  number  of  evolutionary  units  which  are 
shuffled  between  haplotypes  during  evolution.   The  genes 
within a REDS, for example Aα and Aβ in REDS1, show a
strong  linkage  disequilibrium  which  results  in  the 
coordinate  evolution  of  these  two  genes.   In  the  case  of 
the  A  molecule,  this  linkage  disequilibrium  between  these 
co-expressed  genes  appears  to  be  necessary  for  the  proper 
expression  on  the  cell  surface.   This  same  pattern  of 
evolution  is  seen  in  the  t  haplotype  mice  which  contain  a 
large  number  of  wild  type  alleles  suggesting  a  much  higher 
degree  of  recombination  between  these  two  different  forms 
of  chromosome  17  than  previously  expected. 
CHAPTER  I
INTRODUCTION 


The  murine  major  histocompatibility  complex  (MHC) 
located  on  chromosome  17  and  called  H-2  is  a  multigene 
family  coding  for  polymorphic  surface  glycoproteins 
involved  in  cell  recognition  and  the  generation  of  immune 
responses  to  foreign  antigen.   The  I  region,  spanning 
approximately  120  kilobases  of  DNA,  lies  within  H-2  and 
encodes  the  two  class  II  molecules,  I-A  and  I-E.   These 
class II or Ia molecules are heterodimers composed of an α
and a β chain which non-covalently associate in the
cytoplasm and are expressed predominantly on B lymphocytes
and activated macrophages (Flavell and Widera 1986).

The genes for the class II molecules are arranged from
the centromere in the order Aβ, Aα, Eβ, Eβ2, and Eα. The
Aβ, Aα, Eβ, and, to a lesser extent, Eα molecules are very
polymorphic and show many distinct forms or alleles at the
protein  and  DNA  level.   The  set  of  alleles  present  in  the 
I  region  of  a  specific  mouse  constitutes  its  haplotype. 

The  high  amount  of  polymorphism,  i.e.  the  large  number 
of  distinct  alleles,  makes  the  H-2  ideal  for  the  study  of 
homologous  recombination  and  its  influence  on  the 
evolution  of  the  regions  adjacent  to  the  sites  of 
recombination.   Homologous  recombination  is  a  mechanism  by 
which  homologous  nucleotide  sequences  or  allelic  sequences 
on homologous chromosomes are exchanged with high fidelity
during meiosis. Homologous recombination can occur
anywhere along a chromosome. It has been observed that
recombination frequencies can vary for different stretches
of DNA of the same length and that genetic map distances
do not always agree with molecular map distances
(Steinmetz et al. 1982b). This suggests that there are
regions of DNA that either concentrate or suppress
recombinational events. A site where recombination
appears to be localized, first identified in procaryotes,
has been termed a recombinational hotspot (RHS) (Song
1985).

The H-2 contains four documented RHSs, with two falling
within the I region (Steinmetz et al. 1987). The RHS in
Eβ, defined by 12 breakpoints, is localized to a 10 kb
stretch of DNA, and the RHS in Eα is defined by 7
breakpoints localized to a 12-14 kb stretch just
centromeric to the gene (Steinmetz et al. 1982b; Lafuse et
al. 1986). All RHSs in H-2 show three characteristics in
common: 1) high frequency of homologous recombination, 2)
localization to a small stretch of DNA, and 3) haplotype
specificity (Steinmetz et al. 1987).

The  presence  or  absence  of  an  active  RHS  in  different 
individuals  would  be  expected  to  have  a  distinct  influence 
on  the  generation  of  haplotypes  over  an  evolutionary 
timespan.   Homologous  equal  recombination  would  shuffle 
and  generate  new  combinations  of  alleles  which  would  lead 
to new haplotypes in a population. Depending on the
number  of  active  RHSs  in  a  population,  the  extent  of 
allele  shuffling  would  vary  as  would  the  number  of  unique 
haplotypes.   Because  recombination  appears  to  be  localized 
to  specific  sites  within  the  I  region,  markers  located  in 
these  regions  flanked  by  RHSs  should  show  linkage 
disequilibrium. 
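
As an illustration only, the disequilibrium expected between two markers can be quantified with the standard coefficient D = p(AB) - p(A)p(B) and its normalized form r². The Python sketch below is not part of the analysis described in this dissertation; the function name and the example haplotype lists are hypothetical and are chosen only to contrast markers inherited as a block with markers that are freely shuffled.

```python
def linkage_disequilibrium(haplotypes, allele_a, allele_b):
    """Return (D, r_squared) for allele_a at locus 1 and allele_b at locus 2.

    haplotypes is a list of (locus1_allele, locus2_allele) tuples, one per
    independently derived haplotype. The data used below are hypothetical.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    # Allele and two-locus haplotype frequencies.
    p_a = sum(1 for a, _ in haplotypes if a == allele_a) / n
    p_b = sum(1 for _, b in haplotypes if b == allele_b) / n
    p_ab = sum(1 for a, b in haplotypes if a == allele_a and b == allele_b) / n
    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two loci were independent.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


# Hypothetical example: two markers that always travel together.
block = [("k", "k"), ("k", "k"), ("b", "b"), ("b", "b")]
# Hypothetical example: the same alleles in every combination.
shuffled = [("k", "k"), ("k", "b"), ("b", "k"), ("b", "b")]

print(linkage_disequilibrium(block, "k", "k"))
print(linkage_disequilibrium(shuffled, "k", "k"))
```

Under these assumptions the block-inherited pair gives the maximum r² of 1, while the shuffled pair gives D = 0, which is the pattern predicted for markers inside and across RHS-flanked segments, respectively.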

The  aims  of  this  dissertation  are  to  survey  a  large 
collection  of  independently  derived  I  region  haplotypes 
and to identify and localize RHSs. Once the RHSs were
characterized, I sought to determine their influence on the
generation of I region haplotypes and on the relationships
of the genes flanked by RHSs.


CHAPTER  II 
REVIEW  OF  THE  LITERATURE 


The  major  histocompatibility  complex  (MHC) ,  located 
on  chromosome  17  in  the  mouse,  was  first  characterized 
based  on  its  involvement  in  graft  rejection  between 
different  inbred  mouse  lines  (Little  and  Tyzzer  1916) . 
With  the  advent  of  serologic  techniques  and  their 
application  to  the  study  of  the  H-2  complex  (Gorer  1936) , 
the  genetics  of  this  region  began  to  interest  more  and 
more  biologists.   From  this  flourishing  interest  and 
advances  in  various  chemical  and  molecular  techniques,  a 
more  exact  picture  of  the  organization,  structure  and 
function  of  the  H-2  and  its  gene  products  has  emerged. 


Organization  of  the  Major  Histocompatibility  Complex 
Genomic  Characteristics  and  Structure  of  Encoded  Products 


The murine major histocompatibility complex, commonly
referred to as H-2, is a large multigene family which
codes for the cell surface glycoproteins involved in cell
recognition and the control of immune responses to foreign
antigens. The H-2 complex, in genetic map distances,
encompasses approximately 2 centiMorgans of DNA (Klein
1975), which translates into a physical distance of 2000 to
4000 kilobases (Hood et al. 1982). The H-2 encodes three
classes of immune related proteins: class I, class II and
class III (Klein 1975). Based on functional parameters,
the H-2 has been divided into four regions which
correspond to the classes of molecules which they encode.
The K and D regions contain the class I genes, the I
region contains the class II genes, and the S region
contains the class III genes. The class I products fall
into 2 general categories, those involved in graft
rejection and those related to development. The first
group,  the  classical  transplantation  antigens,  is  encoded 
by  genes  denoted  K,  D,  L  and  R  and  is  expressed  on  the 
surface  of  all  nucleated  cells.   Although  they  mediate 
heterologous  graft  rejection  in  the  laboratory,  this  is 
not  their  function  in  vivo.   These  class  I  molecules 
function  in  the  restricted  presentation  of  viral  and  tumor 
antigens  to  cytotoxic  T  lymphocytes  (Zinkernagel  1979) . 
The second group of class I molecules comprises two
families. The Qa family of molecules is expressed on mammalian
nucleated blood cells and the Tla molecules are expressed
on certain leukemias (Michaelson et al. 1983). Whereas
the classical transplantation antigens are very
polymorphic, the Qa and Tla antigens exhibit very low
polymorphism and their functions are still unknown
(Flaherty 1980). Molecular cloning and analysis of the
H-2 have revealed over 32 genes spanning at least 800 kb
for the Qa and Tla products alone (Steinmetz et al.
1982a). The class I molecules show a unified structure of
three extracellular domains, a transmembrane domain and a
cytoplasmic domain, which together constitute a
40,000-45,000 dalton glycoprotein of approximately 350
amino acids. This glycoprotein chain non-covalently
associates with a 12,000 dalton molecule encoded on
chromosome 2 known as β2-microglobulin (Klein et al. 1983b).

The class III genes, contained within the S region,
encode the complement proteins C2, Bf, Slp, and C4.
Although these genes are physically contained in the H-2,
Klein and Figueroa (1981) argue against their inclusion in
the MHC because they are not functionally related to the
class I or class II loci.

The class II genes are contained within the I region,
or immune response region, which was first defined by the
differential ability of inbred mouse strains to mount an
immune response to certain antigens (McDevitt and Sela
1965; Martin et al. 1971) and was later mapped by the
use of recombinant and congenic strains of mice
(Benacerraf and McDevitt 1972). There are two class II
molecules encoded within the I region, I-A and I-E,
assembled from four functional class II genes. These
genes are Aβ, Aα, Eβ and Eα; the region also contains the
pseudogenes Aβ3, Aβ2 and Eβ2 (Widera and Flavell 1985).
A molecular map of the H-2 and the I region is given in
Figure 2-1. These class II molecules are composed of a
35,000 dalton α and a 29,000 dalton β chain of about 220
and 230 amino acids, respectively (Klein et al. 1983b),
which non-covalently associate in the cytoplasm and are
subsequently expressed on the surface of the cell as a
heterodimer. The α and β chains are organized similarly
into five protein domains. They consist of a hydrophobic
leader peptide of 25 amino acids, two 90 amino acid
extracellular domains (α1 and α2, or β1 and β2), a
hydrophobic transmembrane segment of 25 amino acids and a
cytoplasmic domain. The domain structures of the α2, β1
and β2 regions are formed by disulfide bonds between pairs
of cysteine residues located within each domain. The
domain organization of the protein directly reflects the
intron/exon organization of their respective genes.

The β chain genes of the class II molecules are
composed of six exons, one for each protein domain, and an
exon for the 3' untranslated region (Saito et al. 1983).
The α genes are very similar except they are composed of
five instead of six exons, because the transmembrane and
cytoplasmic regions are combined in a single exon
(Mathis et al. 1983; McNicholas et al. 1982). A diagram
of the organization of the class II α and β genes is given
in Figure 2-2.

The  Inclusion  of  the  MHC  Genes  Into  the  Immunoglobulin 
Supergene  Family

With the advent of cloning and sequencing techniques,
a very detailed analysis of class I and class II gene
structure could be performed. Comparisons of protein and
DNA sequences reveal a very similar domain structure for
most of the genes of the immune system, with the domain
organization reflecting the intron/exon organization of
the genes which encode them. This is true for the class
II, class I, Thy-1, β2-microglobulin, T4, T8, T cell
receptor and immunoglobulin genes (Kaufman et al. 1984;
Benoist et al. 1983; McNicholas et al. 1982; Parnes and
Seidman 1982; Larhammar et al. 1982; Sukhatme et al. 1985;
Maddon et al. 1985; Hood et al. 1983; Davis 1985). The
strong similarity in size, structure and sequence among
the membrane proximal domains of these molecules has led
to the theory that the genes arose by divergent evolution
from a single ancestral gene following gene duplication
events (Hood et al. 1983).

I  Region  Organization

Before the advent of molecular cloning, the I region
was considered by immunologists to consist of four
subregions as determined by recombinational analysis based
on serologic and immune response assays (Klein 1975; Klein
et al. 1983a; Mengle-Gaw and McDevitt 1985). The four
defined subregions were I-A, I-B, I-J, and I-E. The I-A
and I-E subregions were serologically defined and encode
the conventional Ia antigens. The genes for Aβ, Aα, and
Eβ map to the I-A subregion, whereas Eα maps to the I-E
subregion (Jones et al. 1978; Murphy et al. 1980). The I-B
subregion was defined by the regulation of immune
responses to IgG2a and lactate dehydrogenase (Lieberman et
al. 1972; Melchers et al. 1973). The I-B subregion was
later abandoned when Dorf and Benacerraf (1975) showed
that this immune response phenotype is controlled by the
complementation of two genes, one from the I-A subregion
and one from the I-E subregion. The I-J
subregion was defined serologically by reagents directed
against an I-J polypeptide, which was believed to be a
suppressor factor from suppressor T lymphocytes (Murphy et
al. 1976; Murphy et al. 1980). However, attempts to
isolate and purify I-J in sufficient quantity for protein
sequence analysis have failed. Molecular characterization
of the I region by Steinmetz et al. (1982b) in the
recombinant strains used to define I-J showed that the
product of I-J would have to be encoded by a 3.4 kilobase
stretch of DNA. Sequence analysis of this fragment showed
that there was no gene which could code for the I-J product
(Kobori et al. 1986).

The exact order and number of the class II genes came
into view when 240,000 contiguous base pairs of the I
region were cloned from the BALB/c mouse (Steinmetz et al.
1982b). Four class II genes were identified, one of which
was judged to be a pseudogene because it failed to
hybridize with a 5' probe. It was determined that the
BALB/c genome contains two α genes and from four to six β
genes. This was confirmed in later work by Widera and
Flavell (1985). The positions of the genes for Aβ, Aα, Eβ,
Eβ2, and Eα were conclusively mapped within the I region.

Subsequently, two other class II β genes were
discovered and determined to be pseudogenes. Larhammar et
al. (1983a) identified Aβ2 and positioned it approximately
20 kilobases centromeric to Aβ. The Aβ2 gene was
sequenced (Larhammar et al. 1983b) and the exon/intron
organization was found to be the same as for the other
class II β genes. The Aβ2 molecule, from the predicted
amino acid sequence, shows only 56% homology to the other
β chains, in contrast to the typical homology of around
80% seen among these β chains. Based on this, Aβ2 was
determined to be the most divergent member of the class II
β genes. Widera and Flavell (1985) isolated and
characterized Aβ3 and localized it to 75 kilobases
telomeric of the K region. Steinmetz et al. (1986) were
able to link the Aβ3 gene from BALB/c to the rest of the I
region, thereby providing a continuous 600 kilobase map of
the K and I regions. The pseudogene Aβ3 shows strong
homology to the other β genes and 83% homology to the
human SBβ gene. Whereas the Aβ2 gene is transcribed
(Larhammar et al. 1983a) but not expressed on the cell
surface due to splicing errors, Aβ3 shows an 8 nucleotide
deletion which would make transcription of the gene
impossible.

Figure 2-3 illustrates the content and organization
of the I region. The genes of the I region are arranged
from the centromere in the order Aβ3, Aβ2, Aβ, Aα, Eβ, Eβ2
and Eα, and span approximately 300 kilobases of DNA, with
the functional class II genes confined to a 110 kilobase
stretch.

Class  II  Gene  Polymorphism 

The MHC genes are the most polymorphic loci known for
vertebrates, which has made the class I and class II genes
of great interest to investigators. Based on serologic
and molecular studies, it has been determined that, in
general, the Aβ, Aα, and Eβ chains are the most
polymorphic (Klein 1975; Benoist et al. 1983), with Eα
being the least polymorphic (Klein et al. 1983a). The
genes of the class II molecules have been shown to exhibit
the same degree of polymorphism as the protein products
they encode. This unique variability of the class II
genes is therefore a reflection of the unique biological
role of these molecules with respect to the immune system.
Mechanisms  for  the  Generation  of  Polymorphism

The entire region containing the class II genes has
been cloned and analyzed extensively (Larhammar et al.
1983b; Choi et al. 1983; Benoist et al. 1983; Widera and
Flavell 1985; Steinmetz et al. 1986). When the I regions
of several laboratory inbred strains of mice were compared
by RFLP analysis using single copy probes spanning the
region, a variable tract characterized by extensive
sequence diversity was found in the centromeric half of
the I region and a conserved tract showing very low
sequence diversity on the telomeric end, with
the two tracts meeting at the 3' portion of the Eβ gene
(Steinmetz et al. 1984). The genes for Aβ, Aα, and the 5'
end of Eβ occupy the 60 kilobases that compose the
variable tract. The conserved tract, spanning 50
kilobases, contains the genes for Eβ2 and Eα. Non-coding
sequences within the two tracts showed the same patterns
of diversity. The mechanisms maintaining this pattern of
diversity are not known, but extensive sequence
comparisons may shed some light on them.

Nucleotide sequence comparisons of Aβ, Aα, and Eβ
reveal extensive sequence polymorphism within laboratory
strains of mice (Benoist et al. 1983; Estess et al. 1986).
Variations in nucleotide sequence of 5% to 10% are not
unusual between alleles of Aβ and Aα, with the most
diversity localized within the β1 and α1 domains. Choi et
al. (1983) did a sequence comparison on genomic clones of
Aβ from the b, d, and k haplotypes and determined that the
majority of amino acid substitutions are localized to the
amino terminus of the encoded molecule. These mutations
are indicative of a pattern of multiple independent
events. Recent work by McConnell et al. (1988) on Aβ
reveals evidence for segmental exchange between alleles to
generate diversity.

Benoist et al. (1983) sequenced six alleles of Aα for
the k, d, b, f, u, and q haplotypes and found that most
substitutions within the α1 exon are clustered into what
they term "regions of allelic hypervariability." A
Kabat-Wu variability plot (Kabat et al. 1979) of the
corresponding amino acid sequence reveals that amino acid
substitutions fall into two hypervariable regions at
residues 11-15 and at residues 56-57, which correspond to
the regions of the molecule responsible for the binding of
foreign antigen (Brown et al. 1988).
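
For readers unfamiliar with the Kabat-Wu plot, the variability at each aligned position is conventionally defined as the number of different amino acids observed at that position divided by the frequency of the most common amino acid there. The short Python sketch below illustrates that definition only; the function name and the toy sequences are hypothetical and are not the Aα sequences of Benoist et al.

```python
from collections import Counter


def kabat_wu_variability(aligned_sequences):
    """Return the Kabat-Wu variability for each column of an alignment.

    variability = (number of distinct residues at the position)
                  / (frequency of the most common residue at the position)

    aligned_sequences must be equal-length strings (already aligned).
    """
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")

    n = len(aligned_sequences)
    scores = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_sequences)
        most_common_freq = counts.most_common(1)[0][1] / n
        scores.append(len(counts) / most_common_freq)
    return scores


# Hypothetical toy alignment: column 0 is invariant, column 1 is highly
# variable, column 2 carries a single substitution.
toy = ["ACD", "ACE", "AFE", "AGE"]
for position, score in enumerate(kabat_wu_variability(toy), start=1):
    print(position, round(score, 2))
```

An invariant position scores 1, and the score rises as more residues are observed and as the most common residue becomes less dominant, so hypervariable regions appear as peaks when the scores are plotted against residue number.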

Mengle-Gaw  and  McDevitt  (1983)  have  reported  regions 
of allelic hypervariability between alleles of Eβ also.
These regions of diversity are localized to the β1 exon
and  are  separated  by  tracts  of  sequence  homology  which  the 
authors  suggest  might  reflect  diversification  by  gene 
conversion. 

The  mechanisms  for  the  generation  of  diversity  of  the 
class  II  genes  are  unknown;  however,  two  hypotheses 
dominate  speculations.   The  first  hypothesis  proposes  that 
new  alleles  in  a  population  are  generated  by 
hypermutational mechanisms such as gene conversion or
segmental  exchange.   Segmental  exchange  or  gene  conversion 
was  originally  defined  in  fungi  (Radding  et  al.  1978)  and 
is  a  mechanism  by  which  DNA  sequence  is  copied  or 
transferred  to  or  from  genes,  usually  belonging  to 
multigenic  or  multiallelic  families  (Baltimore  1981; 
Robertson  1982).   During  meiosis  or  mitosis,  there  is 
pairing  of  partially  homologous  sequences  followed  by 
mismatch  repair  thereby  converting  part  of  one  sequence  to 
that  of  another.   Gene  conversion  events  are  characterized 
by  clusters  of  substitutions  at  the  DNA  level.   This 
pattern  of  diversity  is  clearly  documented  in  class  I 
genes  (Mellor  et  al.  1983;  Weiss  et  al.  1983)  and  evidence 
for  the  same  mechanism  of  diversification  of  the  class  II 
genes, although to a lesser extent, is also seen
(Mengle-Gaw et al. 1984; Widera and Flavell 1985;
McConnell et al. 1988).
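
Because gene conversion is recognized by clusters of substitutions, one simple way to screen a pair of aligned alleles for candidate tracts is to look for several mismatches falling within a short window. The Python sketch below is an illustration under stated assumptions, not the method used in the studies cited above: the function name, the 20 base window and the threshold of three mismatches are all hypothetical choices.

```python
def substitution_clusters(seq_a, seq_b, window=20, min_diffs=3):
    """Return half-open (start, end) spans where mismatches are clustered.

    A cluster is a run of at least min_diffs mismatched positions that all
    lie within `window` bases of the first mismatch in the run. Both the
    window size and the threshold are arbitrary illustrative values.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    # Positions at which the two aligned sequences differ.
    diffs = [i for i, (x, y) in enumerate(zip(seq_a, seq_b)) if x != y]

    clusters = []
    i = 0
    while i < len(diffs):
        j = i
        # Extend the run while the next mismatch is still inside the window.
        while j + 1 < len(diffs) and diffs[j + 1] - diffs[i] < window:
            j += 1
        if j - i + 1 >= min_diffs:
            clusters.append((diffs[i], diffs[j] + 1))
            i = j + 1
        else:
            i += 1
    return clusters


# Hypothetical example: three clustered substitutions and one isolated one.
reference = "A" * 50
variant = list(reference)
for pos in (10, 12, 15, 40):
    variant[pos] = "C"
print(substitution_clusters(reference, "".join(variant)))
```

A span reported by such a screen is only a candidate; distinguishing gene conversion from independent point mutations still requires identifying a plausible donor sequence, as in the studies cited above.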

As mentioned earlier, the majority of mutations
within the class II genes appear to be clustered into
tracts of allelic hypervariability. Direct evidence for
gene conversion in class II genes has been reported by
Mengle-Gaw et al. (1984), where an alloreactive T cell
clone reacted with determinants present on both Eβ and
Aβ^bm12. Sequence comparisons between Aβ^b, Aβ^bm12 and Eβ^b
(Choi et al. 1983; McIntyre and Seidman 1984) reveal
sequence homology between bm12 and Eβ^b where it differs
from Aβ^b. The region that is exchanged, encompassing
approximately 14 nucleotides, is flanked by regions of
exact homology extending for distances of 20 base pairs on
either side of the recombinational event.

By examining the nucleotide sequence of eight alleles
of Aβ, McConnell et al. (1988) found that, for six of the
eight alleles, the evolutionary lineage of the β1 and β2
exons corresponds to the presence or absence of a
retroposon insertion within the second intron which is
used to define these lineages. The β1 exon of two
alleles, Aβ^ and Aβ^°'^, did not reflect their evolutionary
lineage  by  RFLP,  and  therefore  reflects  the  exchange  of 
sequence,  by  segmental  exchange,  from  alleles  of  a 
different  evolutionary  lineage. 

The  second  hypothesis  for  the  generation  of 
diversity,  termed  "trans-species  evolution,"  proposes  that 
the  polymorphism  arose  from  the  steady  accumulation  of 
mutations  over  long  evolutionary  periods,  and  multiple 
advantageous  alleles  have  survived  speciation  (Klein 
1980) .   Trans-species  evolution,  therefore,  represents  a 
mechanism  for  the  maintenance  of  diversity  in  natural 
populations. A recent report (McConnell et al. 1988) has
shown that 90% of 115 Aβ alleles examined by RFLP analysis
fall into two evolutionary lineages based on the presence
or absence of a short interspersed nucleotide element
(SINE). Using the SINE sequence as an evolutionary tag
for the analysis of nine separate species and sub-species
of the genus Mus, the authors determined that the SINE
sequence could be identified in species that diverged over
eight  million  years  ago.   Therefore,  these  alleles 
containing  the  retroposon  insertion  must  have  survived 
speciation  suggesting  the  role  of  trans-species  evolution 
in  the  generation  of  polymorphism  seen  in  modern  Mus 
species. 

The above findings indicate that both hypermutational
mechanisms and trans-species evolution contribute to the
diversity of class II genes. The diversity within class
II  genes  is  localized  to  the  regions  of  the  molecule 
responsible  for  antigen  binding,  suggesting  that  selection 
for  functional  diversity  in  the  binding  sites  of  these 
molecules  may  maintain  these  polymorphisms  in  natural 
populations.   Taken  together,  this  suggests  that  strong 
selective  pressures  play  an  important  role  in  the 
maintenance  of  MHC  polymorphism. 

Functional  Role  of  MHC  Polymorphism 

The  MHC  molecules  are  involved  in  cell  recognition 
and  generation  of  the  immune  response  to  foreign  antigen. 
The  interaction  of  foreign  antigen,  the  class  II 
molecules,  and  the  T  cell  receptor  determines  if  an  animal 
can mount an immune response. Therefore, the MHC
molecules play a pivotal role in the survival of the animal
when challenged by pathogens in its natural environment.


Regulation  and  Expression  of  MHC  Molecules

Class II molecules are expressed predominantly on two
cell types, collectively called antigen presenting cells
(APC), and typified by the macrophage and the B
lymphocyte. It is well documented that macrophages and
macrophage-like cells play a fundamental role in the
induction  of  immune  responses.   The  interaction  of  the 
regulatory  T  lymphocyte  and  the  APC  is  under  the  control 
of  the  I  region  of  the  MHC,  termed  MHC  restriction,  and 
the  ability  of  the  APC  to  present  antigen  is  dependent  on 
the  cell  surface  expression  of  a  class  II  molecule  (Unanue 
1983) .   It  has  been  demonstrated  that  the  expression  of 
class  II  is  not  constitutive  in  macrophages  and  can  come 
under  both  positive  and  negative  control  (Steinman  et  al. 
1980;  Snyder  et  al.  1982) .   In  the  case  of  positive 
control,  McNicholas  et  al.  (1982)  have  shown  that  factors 
secreted  from  mitogen  activated  spleen  cells  induce  the 
biosynthesis  and  cell  surface  expression  of  MHC  antigens. 
Subsequent  studies  have  determined  this  factor  to  be 
gamma-interferon  (Steeg  et  al.  1982;  King  and  Jones  1983). 
When  macrophages  are  incubated  with  immune  interferon, 
there  is  a  coordinate  increase  in  mRNA  for  the  four  class 
II  chains  within  an  eight  hour  period  (Paulnock-King  et 
al.  1985) .   Further  studies  on  class  I  induction  on 
macrophages  by  gamma-interferon  suggest  the  role  of  a 
common  sequence  in  the  promoter  of  the  genes,  in 
association  with  a  functional  enhancer  sequence,  necessary 
for the induction of their expression (Israel et al.
1986) . 

The  B  lymphocyte,  in  contrast  to  the  macrophage, 
shows  a  heterologous  constitutive  level  of  class  II  on  its 
cell  surface  (Mond  et  al.  1981;  Monroe  and  Cambier  1983). 
Mitogen  activated  T  cell  supernatants  were  shown  to 
increase the levels of cell surface Ia on resting B cells
(Roehm et al. 1984), and later studies identified this
factor as B-cell stimulatory factor 1 (BSF-1) (Noelle et
al. 1984). BSF-1, or interleukin 4, induces mRNA levels
within  one  hour  and  cell  surface  levels  as  early  as  two 
hours  (Polla  et  al.  1986) . 

These  two  mechanisms  of  class  II  induction  reflect 
the  importance  of  the  cell  surface  expression  of  class  II 
for  the  interactions  of  the  regulatory  T  lymphocytes  and 
the  APC  for  the  initiation  of  an  immune  response. 

Chain  Association  and  the  Functional  Expression  of  Class 
II  Molecules 

Early studies have shown that the class II molecules
are heterodimeric in nature, requiring the association of
an α and a β chain (Cullen et al. 1974; Jones et al. 1978).
By evaluating the functional role of the class II
molecules, it was observed that certain immune responses
in recombinant mice of the b and k haplotypes mapped to
separate subregions of the I region and were therefore
under the control of two genes (Jones et al. 1978).
It was noted that some laboratory inbred mice carry
mutations that cause the failure of expression of the
class II E molecule on the cell surface (Jones et al.
1981). The cloning and analysis of the I region by
Steinmetz et al. (1982b) showed that the genes for Eβ and
Eα are present in the strains of mice that do not express
an E molecule. These defects in expression fall into
three categories (Hyldig-Nielson et al. 1983; Mathis et
al. 1983): the H-2^ and H-2^ haplotypes have a deletion
in Eα, the H-2^ haplotype makes an Eα message of aberrant
size, and the H-2^ haplotype has a defect in the stability
of the Eα message. Lack of E molecule expression has been
documented to be as high as 30% in wild mice, with levels
of 50% in the t haplotypes, which can be found in
frequencies of up to 40% in wild populations (Nizetic et
al. 1984). Eighteen t haplotype strains were shown to
lack expression of an E molecule (Dembic et al. 1984).
Sixteen of the eighteen strains carry a deletion in the
promoter of Eα identical to that seen in inbred mouse
strains. The three non-expressing strains which do not
carry this deletion carry a mutation where the gene is
transcribed but no protein is expressed on the cell
surface. These three mutations represent the extreme case
where changes in one chain of the class II molecule affect
cell surface expression and the ability of an animal to
mount an immune response to certain antigens.

There are no reports of the lack of expression of an
A molecule within natural populations of mice. The
importance of maximizing the amount of class II variation
is believed to be reflected in the observation that α and
β chains of a given isotype (i.e. A or E) can
transassociate in heterozygotes (Fathman and Kimoto 1981).
These findings have given rise to the notion of free
association among alleles within an isotype. Studies on
wild derived haplotypes, by the analysis of tryptic
peptide fingerprints from serologically related groups of
mice (Wakeland and Klein 1983), show that Aα and Aβ within
these strains differ by less than 10% of their tryptic
peptides (Wakeland and Darby 1983). RFLP analysis of Aβ
and Aα for this same allelic family corroborates this
observation at the DNA level (McConnell et al. 1986).
Recent studies indicate, however, that polymorphism can
dramatically affect the ability of the Aα and Aβ subunits
to assemble as an A molecule for functional expression on
the cell surface (Germain et al. 1985; Gilfillan et al.
1988). In these studies, Aα and Aβ genes from different
haplotypes were either co-transfected or introduced into
transgenic mice. It was observed that haplotype
mismatched chains cannot associate effectively enough to
yield appreciable levels of the transassociated pairs.
Taken together, these findings suggest that, for proper
assembly and cell surface expression, the α and β chains
of the A molecule need to be co-adapted and, therefore, be
from the same or similar haplotype.

Role  of  Class  II  in  the  Presentation  of  Foreign  Antigen 
It  is  the  interaction  of  foreign  antigen,  class  II 
molecules  and  the  T  cell  receptor  which  determines  if  an 
animal  will  mount  an  immune  response.   Unlike  the  B 
lymphocyte,  the  T  cell's  receptor  cannot  bind  and 
recognize  free  antigen  (Moller  1978;  Moller  1980).   It  is, 
therefore,  the  role  of  the  class  II  molecule  to  present 
antigen in such a way as to enable the T lymphocyte to
respond  and  initiate  an  immune  response. 

The  T  cell  receptor  must  recognize  a  bimolecular 
ligand  composed  of  the  antigen  and  the  MHC  class  II 
molecule (Schwartz 1985). Studies have shown that most T
cells recognize non-native forms of the antigen, as seen
with lysozyme (Adorini et al. 1979), ovalbumin
(Shimonkevitz et al. 1983), myoglobin (Streicher et al.
1984), and insulin (Falo et al. 1986). Conversion of
antigen from a native to a non-native form is termed
antigen processing, and it is performed by APCs which
express class II antigens (Allen 1987).

The  nature  of  the  interaction  of  MHC  molecules  and 
processed  foreign  antigen  is  of  great  interest  for  the 
understanding  of  the  functional  role  of  MHC  polymorphism. 
An  early  study  by  Babbitt  et  al.  (1985)  revealed  that 
immunogenic  peptides  from  hen  egg  lysozyme  bind 
specifically to class II molecules from a responder, but
not a non-responder, haplotype. Subsequent studies have
focused on the residues responsible for this interaction
and the exact nature of the binding between antigen and
class II (Buus et al. 1986; Sette et al. 1987; Buus et al.
1987). Recently, the three dimensional structure of a
class I molecule was determined by X-ray crystallography
(Bjorkman et al. 1987) and, because of the significant
domain and sequence homologies between class I and class
II molecules, Brown et al. (1988) proposed a model for the
class II molecule similar to that determined for class I
molecules. The cell-surface portions of each subunit
contain two domains (α1, α2, β1, and β2), in which the α1
and β1 domains are postulated to jointly form the binding
site which interacts with peptide antigens. The binding
site is a groove produced by two parallel alpha helices
which rest atop a platform formed by an eight strand beta
pleated sheet. The α1 and β1 domains of the class II
molecules are each postulated to donate one of the alpha
helices and four strands of the beta pleated sheet,
which combine symmetrically to form the binding site. The
polymorphic residues within the α1 and β1 domains
have been shown to occupy sites within this groove,
thereby representing contact points between MHC, antigen,
and the T cell receptor. This model has not yet been
confirmed by X-ray crystallography, but conforms
to a variety of structural and functional studies (Allen
et al. 1987; Buus et al. 1987; Guillet et al. 1987).

Taken  together,  these  data  show  the  importance  of  the 
combinatorial associations of the α and β chains of the
class II molecules. The influence of polymorphisms on
their association can drastically affect the development
of an effective immune response. Maintenance of co-adapted
α and β chains will ensure the proper assembly and
expression of a class II molecule.

Homologous Recombination Within the MHC

The  mouse  MHC  offers  a  unique  opportunity  for 
investigating  whether  homologous  meiotic  recombination 
happens  at  random  or  at  specific  sites  within  the  genome. 
Many  recombinant  strains  of  mice  have  been  characterized, 
and  the  advent  of  molecular  cloning  has  made  it  possible 
to  map  the  crossover  sites. 

Hotspots  of  Homologous  Recombination 

The  DNA  in  the  genome  is  not  a  static  structure  and 
undergoes  reorganizational  events  during  evolution  and 
development.   There  is  exchange  of  nucleotide  sequences 
between  chromosomes  by  homologous  and  non-homologous 
recombination.   Recombination  is  a  mechanism  by  which  DNA 
sequences  are  exchanged  between  homologous  and  non- 
homologous chromosomes  either  during  meiosis  or  mitosis. 
There are three types of recombination: non-homologous,
homologous  equal,  and  homologous  unequal.   Non-homologous 
recombination  occurs  between  sites  within  distinct 
structural  environments  which  may  be  located  on  the  same 
or  different  chromosomes.   Non-homologous  recombination  is 
also  referred  to  as  site  specific  recombination  and  acts 
in  the  differentiation  of  some  prokaryotic  and  eukaryotic 
cells.   Some  specific  examples  of  non-homologous 
recombination  are  the  integration  or  excision  of 
bacteriophages  and  bacterial  transposons  (Bauer  et  al. 
1984;  Calos  and  Miller  1980),  and  the  rearrangement  of 
immunoglobulin  and  T  cell  receptor  genes  in  eukaryotes 
(Tonegawa  1983;  Davis  1985).   Non-homologous  recombination 
usually  represents  mitotic  or  somatic  events. 

Homologous  recombination,  or  allelic  recombination, 
occurs  between  homologous  or  allelic  nucleotide  sequences 
on  homologous  chromosomes.   Homologous  recombination  can 
generate  new  combinations  of  alleles  by  the  exchange  of 
sequences  between  homologous  chromosomes.   Homologous 
equal  recombination  breaks  and  rejoins  nucleotide 
sequences  at  precisely  the  same  position  whereas 
homologous unequal recombination cuts and rejoins at
different  locations  on  homologous  chromosomes  leading  to 
the  accumulation  of  duplications  and  deletions. 

Homologous  recombination  can,  theoretically,  occur 
anywhere  along  the  chromosome.   It  has  been  recognized 
that  recombination  frequencies  can  vary  for  different 
stretches of DNA of the same size and that genetic map
distances do not always agree with molecular map distances
(Steinmetz et al. 1982b). Both these observations suggest
that recombination is site specific, with
regions that either enhance or suppress recombinational
events. A site where recombination is localized has been
termed a recombinational hotspot (RHS) (Smith 1983).

Recombinational  hotspots  were  first  reported  in 
bacteriophage  lambda  where  mutants  arose  that  grew  better 
in  E.  coli  than  the  wild  type  phage  due  to  a 
recombinational  event  localized  to  an  eight  nucleotide 
sequence. This sequence, GCTGGTGG, or Chi (for crossover
hotspot instigator), enhances recombination leading to
better growth in the host bacteria (Smith 1983). The
function of the Chi sequence has been determined to be the
stimulation of homologous recombination. It exerts its
greatest activity within 10 kilobases upstream, and there
appears to be a second sequence involved which is located
downstream (Smith et al. 1981). Chi sequences are found
in E. coli at a frequency of one every five kilobases
(Malone et al. 1978). The Chi sequence may therefore
represent a molecular basis for recombination.
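
As an illustration of how the eight nucleotide Chi sequence can be
located in a stretch of DNA, the following is a minimal sketch in
Python. It is not taken from any of the studies cited; the function
names and the test sequence are hypothetical, and the search simply
reports exact matches on both strands.

```python
# Minimal sketch: locate exact copies of the Chi octamer on either strand.
# Function names and the example sequence are illustrative assumptions.

CHI = "GCTGGTGG"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif(seq, motif):
    """Return 0-based start positions of every exact (possibly overlapping) match."""
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def chi_sites(seq):
    """Report Chi positions on the given strand and on the opposite strand."""
    return {
        "forward": find_motif(seq, CHI),
        "reverse": find_motif(seq, reverse_complement(CHI)),
    }


if __name__ == "__main__":
    # Hypothetical sequence: one Chi site on each strand.
    example = "AA" + CHI + "TT" + reverse_complement(CHI) + "AA"
    print(chi_sites(example))
```

Exact matching is the simplest possible choice here; searches for
sequences with only partial homology to Chi would need a mismatch
tolerant comparison instead.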

Homologous  recombination  has  also  been  documented  in 
eukaryotic  systems,  such  as  yeasts,  which  also  serve  as  a 
vector  system  for  the  study  of  recombination  prone 
sequences (Song 1985). Recombination has been documented
in the human genome within the β-globin gene cluster
(Orkin and Kazazian 1984). By examining haplotype
associations of polymorphic restriction endonuclease
sites, recombination could be localized to a 9.1 kilobase
stretch of DNA located between the δ-globin gene and the
first exon of the β-globin gene. Because several
haplotypes  carry  identical  mutations  in  the  5'  genomic 
segment  which  are  found  in  association  with  different  3' 
genomic  segments,  this  site  was  determined  to  be  a 
recombinational  hotspot. 
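
The reasoning applied to the β-globin cluster, in which one 5'
restriction site pattern is found joined to more than one 3'
pattern, can be sketched as a simple tabulation. The example below
is a hypothetical illustration in Python; the pattern labels are
invented placeholders and do not correspond to published
haplotypes.

```python
# Minimal sketch: flag 5' segment patterns that occur with more than one
# 3' segment pattern, which is the signature of exchange between segments.
# All labels below are invented placeholders, not published data.
from collections import defaultdict


def shuffled_segments(haplotypes):
    """Map each 5' pattern to its 3' partners, keeping only those with several."""
    partners = defaultdict(set)
    for five_prime, three_prime in haplotypes:
        partners[five_prime].add(three_prime)
    return {
        five_prime: sorted(three_primes)
        for five_prime, three_primes in partners.items()
        if len(three_primes) > 1
    }


if __name__ == "__main__":
    observed = [
        ("5'-I", "3'-x"),
        ("5'-I", "3'-y"),   # same 5' pattern, different 3' pattern
        ("5'-II", "3'-x"),
        ("5'-III", "3'-z"),
        ("5'-I", "3'-x"),
    ]
    print(shuffled_segments(observed))
```

A shared 5' pattern with several 3' partners is consistent with
recombination between the two segments, although recurrent mutation
would have to be excluded before drawing that conclusion.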

Recombination  Within  H-2 

Hotspots  for  homologous  recombination  have  been 
identified  and  characterized  in  the  MHC  of  the  mouse. 
Because  of  the  ability  to  breed  homozygous  mouse  strains 
and  the  large  number  of  distinct  alleles  for  the  genes 
within H-2, researchers have been able to compile an
extensive collection of intra-I region recombinant
congenic inbred mouse strains. These strains were first
identified serologically and functionally by demonstrating
the co-expression of two distinct parental epitopes for
the A and E molecules in a single offspring (Stimpfling and
Durham  1972;  Benacerraf  and  Dorf  1976). 

Molecular cloning and characterization of the class
II genes made it possible to locate the recombinational
breakpoints at the molecular level. Steinmetz and
co-workers (1982b) characterized nine intra-I region
recombinant strains and found that all the
recombinational events map to a single site within the I
region. This site is localized to a 3.4 kilobase region
encompassing the second intron of the Eβ gene. These
breakpoints were further characterized through Southern
blot analysis by Kobori et al. (1984), and were localized
to only 2.0 kilobases. Sequence analysis of three
parental and four I region recombinants revealed that, in
three of the recombinants, the recombinational event
occurred within a 1 kilobase region of DNA (Kobori et al.
1986). Several subsequent studies have identified more
intra-I region recombinants in which the breakpoints map
to this RHS (Saha and Cullen 1986a, 1986b; Lafuse and
David 1986). In all, there have been 12 breakpoints
defined, which are localized within 10 kilobases of DNA
spanning the Eβ gene.

Shiroishi and co-workers (1982) examined a congenic
mouse strain, B10.MOL-SGR (Mus musculus molossinus), and
found a tremendously enhanced frequency of recombination
between the K and A loci. Steinmetz et al. (1986)
examined a similar mouse, CAS4 (Mus musculus castaneus),
which shows a recombination rate as high as 1.5% within
this same portion of the genome, which encompasses 40
kilobases.

A  third  recombinational  hotspot  was  identified  in 
another strain of M. m. castaneus, CAS3, which exhibits a
recombination rate of 0.6% with breakpoints localized to a
9.5 kilobase stretch of DNA between Aβ3 and Aβ2 (Steinmetz
et al. 1986). Further analysis of the nucleotide sequence
of five similar recombinant haplotypes revealed that all
the breakpoints are confined to a 3.5 kilobase region of
DNA (Uematsu et al. 1986). Of the breakpoints examined,
all  show  homologous  recombination  without  any  DNA 
sequences  duplicated  or  deleted  between  the  parental  and 
recombinant  haplotypes. 
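
To put the two castaneus-derived recombination rates on a common
scale, they can be expressed as centiMorgans per kilobase. The short
Python sketch below uses the rates and interval sizes quoted above;
the genome-wide average of roughly 0.5 cM per megabase is an assumed
round figure introduced here only for comparison, and 1%
recombination is treated as 1 cM, which is reasonable only for short
intervals.

```python
# Minimal sketch: recombination intensity of two H-2 hotspots.
# Hotspot values are those quoted in the text. The genome-wide average is an
# ASSUMED round figure (about 0.5 cM per megabase) used only for comparison.

ASSUMED_GENOME_AVERAGE_CM_PER_KB = 0.0005

HOTSPOTS = {
    # name: (percent recombination, interval size in kilobases)
    "CAS4, K to A interval": (1.5, 40.0),
    "CAS3, Abeta3 to Abeta2 interval": (0.6, 9.5),
}


def cm_per_kb(percent_recombination, kilobases):
    """Treat 1% recombination as 1 cM and divide by the physical distance."""
    return percent_recombination / kilobases


def fold_over_average(percent_recombination, kilobases,
                      average=ASSUMED_GENOME_AVERAGE_CM_PER_KB):
    """Ratio of the local intensity to the assumed genome-wide average."""
    return cm_per_kb(percent_recombination, kilobases) / average


if __name__ == "__main__":
    for name, (percent, kilobases) in HOTSPOTS.items():
        print(f"{name}: {cm_per_kb(percent, kilobases):.4f} cM/kb, "
              f"about {fold_over_average(percent, kilobases):.0f}-fold "
              f"over the assumed average")
```

Under these assumptions both intervals recombine on the order of a
hundred times more often than an average stretch of DNA of the same
length, which is the sense in which they are called hotspots.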

Recently,  a  fourth  RHS  has  been  identified  which  maps 
to a 12 to 14 kilobase region centromeric to the Eα gene,
as characterized by seven breakpoints (Lafuse and David
1986). Figure 2-4 gives a map of the RHSs within the MHC.
Earlier data from serologic and tryptic peptide
fingerprints (Singh et al. 1981; Wakeland and Darby 1983)
have provided evidence for the existence of a fifth RHS
within the MHC which maps between the Aα and Eβ genes in
some wild derived haplotypes. Data presented in this
dissertation  help  to  confirm  the  existence  of  this  site  of 
recombination. 

Haplotype  Specificity  of  Recombination  within  the  MHC 

The  presence  of  RHSs  at  a  given  site  depends  on  the 
haplotype  of  the  MHC  involved  in  the  genetic  cross. 
Recombination within the hotspot in Eβ is exhibited by
strains of the b and k haplotypes in genetic crosses
producing recombinant offspring (Steinmetz et al. 1982b;
Lafuse et al. 1986). Recombination within two distinct M.
m. castaneus haplotypes, cl and c4, reveals two distinct
recombinational hotspots in the interval between K and Aβ2
(Steinmetz et al. 1986; Uematsu et al. 1986).
Recombinational events in the c^ haplotype mapped to the
Aβ3/Aβ2 hotspot, whereas recombination in the c4 haplotype
occurred in the K/Aβ3 hotspot. Crossing over in the
hotspot between Eβ2 and Eα has so far only been detected
in crosses of the e haplotype (Lafuse et al. 1986; Lafuse
and David 1986).

Molecular  Basis  For  Recombinational  Hotspots  in  the  MHC 

There  are  similarities  between  the  Chi  sequence  in  E. 
coli  and  the  recombinational  hotspots  in  the  MHC.   As  seen 
for  Chi,  breakpoints  are  clustered,  only  homologous 
recombination  is  seen,  and  the  activity  of  the  RHS  appears 
to  be  dominant.   Therefore,  the  nucleotide  sequences 
around  the  RHSs  were  examined  for  sequences  with  homology 
to Chi. Studies by Steinmetz et al. (1986) and Kobori et
al. (1986) showed that the region around the Eβ RHS
contains a sequence composed of a tetramer, CAGG, which is
repeated 22 times. This sequence has limited homology to
the Chi sequence of bacteriophage lambda, which is known
to promote recombination. A much stronger degree of
homology is found between this sequence and the core
sequence of human minisatellite DNA, which may facilitate
recombination in human chromosomes (Jeffreys et al. 1985).
There is no functional evidence, however, to suggest that
these repeated sequences are important in recombination
within the MHC. Sequence comparisons can only suggest
possible control sequences for recombination, and only
functional assays can identify structural elements
required for recombination.
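
A sequence comparison of the kind described above can be sketched
as a search for tandem runs of a short repeat unit. The Python
example below is a minimal illustration only: it assumes perfect,
uninterrupted copies of the unit, which may not hold for the actual
CAGG repeat, and the test sequence is synthetic.

```python
# Minimal sketch: find uninterrupted tandem runs of a short repeat unit.
# Assumes perfect copies; the example sequence is synthetic, not the
# published sequence around the hotspot.
import re


def tandem_runs(seq, unit="CAGG", min_copies=2):
    """Return (0-based start, copy count) for each run of at least min_copies."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies))
    return [
        (match.start(), (match.end() - match.start()) // len(unit))
        for match in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    synthetic = "TT" + "CAGG" * 22 + "AACAGGT"
    print(tandem_runs(synthetic))
```

As the text notes, finding such a run only nominates a candidate
sequence; it says nothing about whether the repeat is functionally
required for recombination.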

Wild  Mice 

The  goals  of  this  dissertation  are  to  survey 
polymorphism  at  loci  evenly  spaced  across  the  I  region  and 
to  determine  the  location  and  influences  of 
recombinational  hotspots  on  the  evolution  of  I  region 
haplotypes  in  modern  species  of  Mus.   Previous  studies  on 
this  subject  were  limited  in  scope  due  to  the  nature  of 
the  strains  of  mice  used  to  address  these  questions. 
Inbred  mice  represent  only  a  small  subset  of  highly  biased 
haplotypes.   These  strains  are  derived  from  a  limited 
number  of  sources  that  were  generated  from  a  high  degree 
of  inbreeding,  thereby  representing  a  biased  sampling  of 
the  mouse  population  and  an  artificial  collection  of 
considerable  homogeneity  (Ferris  et  al.  1982;  Klein  1974). 

Wild  mice  represent  a  population  whose  breeding  is 
not controlled by humans (Bruell 1970) and, therefore,
represent  a  collection  of  I  region  haplotypes  of 
considerable  heterogeneity.   These  mice  also  represent  the 
product  of  evolutionary  processes  where  the  I  region 
haplotypes  are  fixed  and  maintained  through  natural 
selective  pressures. 

Natural History of the Wild Mouse

Wild  mice  can  be  divided  into  three  groups  based  on 
their associations with humans (Sage 1981). Aboriginal
mice  are  free  living  mice  with  essentially  no  interaction 
with  humans.   Commensal  mice,  on  the  other  hand,  live  in 
close  association  with  humans,  and  in  most  cases  rely  on 
humans  for  their  source  of  food  and  shelter.   The  third 
group,  feral  mice,  represent  mice  which  have  made  the 
transition  from  a  commensal  association  back  to  the 
aboriginal  state. 

The  commensal  mice  fall  into  four  subspecies  of  Mus 
musculus: M.m.domesticus, M.m.musculus, M.m.castaneus, and
M.m.molossinus (Marshall 1981). The research in this
dissertation  is  concerned  only  with  mice  of  the 
M.m.domesticus  subspecies.   Based  on  fossil  evidence, 
nuclear  genetic  variation  and  mitochondrial  genetic 
variation,  Ferris  et  al.  (1983)  estimate  that  this 
commensal  relationship  between  mouse  and  man  has  been  in 
existence  for  more  than  1  million  years. 

In  contrast  to  aboriginal  mice  whose  range 
encompasses  only  the  Eurasian  continent,  commensal  mice, 
also  indigenous  to  Eurasia,  have  radiated  to  the  new  world 
around  much  the  same  time  as  the  first  human  settlers. 
The  commensal  mice  have  adapted  remarkably  well  to 
extremely varied climatic conditions, with habitats spanning
Europe, the Americas, Australia, Africa, and several
South Pacific islands. They may represent the most
evolutionarily advanced member of the genus (Marshall
1981).

M.m.domesticus is presently found throughout the
world and, within its native range, can be found in
habitats as diverse as households, agricultural fields,
barren rocky ravines (Gaisler 1975; Hassinger 1973), salt
marshes (Breakey 1963), grasslands (Pearson 1963), coal
mines (Philip 1938), and mountain environments (Harland
1958).

The  t  Complex 

The  t  complex  is  a  gene  complex  located  on  the 
centromeric one third of chromosome 17 adjacent to H-2,
and accounts for nearly 1% of the mouse genome. There are
two  major  structural  forms  of  chromosome  17,  the  wild  type 
and  the  t  form.   The  t  form  is  carried  in  10%  to  40%  of 
wild  mice  (Artzt  et  al.  1985;  Dembic  et  al.  1984).   A 
complete  t  haplotype  is  one  that,  by  definition, 
suppresses  recombination  along  the  entire  12  centiMorgan 
region  from  the  gene  locus  Brachyury  (T)  to  the  H-2 
complex.   The  different  t  haplotypes  are  all  structurally 
related  to  one  another.   Within  the  chromosomal  region 
occupied  by  the  t  haplotypes,  there  are  genes  common  to 
non-t  bearing  mice  and  mutant  genes  characteristic  of  the 
t  complex. 

Mutant  genes  within  the  t  haplotypes  have  been  shown 
to cause abnormalities in tail length, embryogenesis,
fertility, male transmission ratios, and meiotic
recombination (Dunn and Gluecksohn-Schoenheimer 1950;
Silver 1985). The t complex has been termed "selfish DNA"
(Klein et al. 1986), which serves no apparent purpose other
than  self  propagation  and  dissemination  throughout  a 
population.   Two  reasons  for  the  prevalence  of  t 
chromosomes  in  the  wild  are  their  ability  to  sway  their 
own  transmission  and  their  ability  to  keep  the  genetic 
elements  responsible  for  segregation  distortion  together 
by  the  suppression  of  recombination. 

The  molecular  nature  of  the  segregation  distorters  is 
not  known.   It  is  well  documented  that  wild  males  carrying 
a  complete  t  haplotype  will  transmit  their  t  chromosome  to 
greater  than  90%  of  their  offspring  (Lyon  and  Meredith 
1964a, 1964b). Mice carrying a partial t haplotype can
transmit it only when complemented by another chromosome
which can restore the transmission distorter (Silver
1985). Lyon (1984) suggests that a series of distorter
loci, Tcd, can act on a single responder locus, Tcr. The
Tcd loci act in an additive fashion in either a cis or
trans configuration to the Tcr, and when the additive
effect of the Tcd loci reaches a certain level, a high
degree of transmission of that chromosome is seen.

The  mechanism  responsible  for  the  suppression  of 
meiotic recombination is much more straightforward than
for  transmission  distortion.   The  partial  t  haplotypes 
were an important tool in elucidating the molecular basis
of the recombination suppression. The region of
suppression  in  the  partial  t  haplotypes  extends  only  as 
far  as  the  t  DNA  present  (Bechtol  and  Lyon  1978;  Bennett 
et al. 1978). Normal levels of recombination are observed
between two t chromosomes, in contrast to the suppression
seen between a t chromosome and the wild type. This
suggests that the structures of the t haplotypes are
similar to each other, yet different from wild type DNA
(Artzt et al. 1982a; Condamine et al. 1983).
Subsequently, it was shown that the t haplotypes have a
proximal inversion encompassing T and the Tcp (t complex
proteins) products (Herrmann et al. 1986), and a distal
inversion containing tf (tufted locus) and H-2 (Artzt et
al. 1982b; Shin et al. 1983; Shin et al. 1984).
Recombination,  therefore,  is  suppressed  between  t  and  the 
wild  type  due  to  the  inversion  of  these  regions. 

The  other  characteristics  of  the  t  complex,  sterility 
and  lethality,  have  been  suggested  to  be  secondary  add-ons 
to the primary properties of the t chromosome:
transmission distortion and recombination suppression
(Klein et al. 1986). This region of the chromosome is
believed to carry genes instrumental in embryogenesis and
development which have become mutated, hence the lethality
and sterility seen, and which are carried along with the
"selfish DNA" and disseminated throughout wild
populations. This hypothesis is supported by reports
that lethality mutations appear to be single locus
mutations which can complement each other in genetic tests
(Bennett 1975; Klein et al. 1984; Winking and Guenet
1978).

Due  to  the  inclusion  of  the  MHC  in  the  recombination 
suppression  of  the  t  haplotypes,  the  association  of  the 
alleles  at  MHC  loci  with  the  t  haplotypes  is  of  great 
interest. The t forms of chromosome 17 are believed to be of
single founder origin, or at least of a limited founder
origin (Klein et al. 1986). This suggests that the H-2
haplotype associated with a t haplotype will represent a
unique evolutionary lineage, separate from those seen
in the wild type. Figueroa et al. (1985) have reported
the existence of three groups of class II alleles
associated with particular t haplotypes. Dembic et al.
(1984) and Nizetic et al. (1984) have shown a correlation
between a deletion in Eα and its association with the t
haplotypes. This identical deletion is also seen in wild
type forms of chromosome 17, which has been interpreted as
evidence of an ancient origin of this deletion, but which
may also be interpreted as its having been introduced through
recombination with the wild type. A recent study has
shown alleles for Aβ shared between t and the wild type
(McConnell et al. 1988). This, in conjunction with data
in this dissertation, suggests a higher degree of
recombination between the t and wild type forms of
chromosome 17.

Class II Gene Polymorphisms in Wild Mice

Evidence  for  the  presence  of  H-2  specificities  which 
are  unique  to  wild  mice,  and  not  present  in  panels  of 
laboratory  strains,  led  to  the  quest  for  new  alleles  from 
wild  mouse  populations.   Serology  was  a  powerful  tool  for 
the  analysis  and  characterization  of  these  H-2 
specificities,  but  there  was  a  problem  in  separating  these 
reactivities  from  non-H-2  antigens  and  other  H-2 
specificities  in  the  heterozygous  animal.   This  problem 
led  to  the  development  of  wild  derived  congenic  mouse 
lines on a B10 background, collectively referred to as the
B10.W congenic lines (Klein 1973, 1975). Wild male mice
were bred with B10.BR female mice and the offspring were
backcrossed  8  to  14  times  with  the  continual  selection  of 
an  H-2  marker  specific  for  the  wild  haplotype.   These 
lines  were  maintained  by  brother  x  sister  matings  and  the 
wild  H-2  haplotypes  selected  for  on  a  C57BL/10  background. 
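
The effect of repeated backcrossing on the rest of the genome can
be estimated with a simple halving argument, sketched here in
Python. This is a textbook expectation rather than a measurement on
the B10.W lines: it applies only to donor (wild) DNA unlinked to
the selected H-2 marker, and DNA flanking H-2 is expected to persist
for longer. The function name is illustrative.

```python
# Minimal sketch: expected fraction of unlinked donor (wild) genome that
# remains after an initial outcross followed by repeated backcrosses.
# Textbook expectation only; it ignores DNA linked to the selected marker.


def expected_donor_fraction(backcrosses):
    """The F1 is half donor; each backcross to the recipient halves it again."""
    if backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return 0.5 ** (backcrosses + 1)


if __name__ == "__main__":
    for generations in (8, 14):
        fraction = expected_donor_fraction(generations)
        print(f"{generations} backcrosses: {fraction * 100:.4f}% donor genome expected")
```

Under this expectation, 8 backcrosses leave about 0.2% and 14
backcrosses about 0.003% of unlinked wild derived DNA, so that
serological differences from the background strain can largely be
attributed to the selected H-2 region.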

Serologic examination of the B10.W lines revealed the
extreme polymorphism of the class II genes within wild
populations (Klein 1975; Zaleska-Rutczynska and Klein
1977). Of the 16 haplotypes examined, a few appear
identical to inbred haplotypes, a few are identical to
each other, but the majority represent novel haplotypes
which are not seen in laboratory inbred strains. Later
serologic analysis of 29 wild derived haplotypes by
Wakeland and Klein (1979a; 1981) revealed three new I
region haplotypes: u, v, and ±. These same types of
analyses of wild haplotypes show evidence for possible
recombination events (Duncan and Klein 1980; Wakeland and
Klein 1979b), which points out the value of wild derived
H-2 haplotypes for the study of recombination.

Tryptic peptide mapping, in conjunction with the
serologic data, demonstrates that many of the haplotypes
can be grouped into families of variant alleles (Wakeland
and Klein 1983). The nature of the variations between
alleles of the A molecule within these families was
investigated, and it was found that these alleles differ
by less than 10% of their tryptic peptides. Most of these
differences are localized in the α1 and β1 domains of the
A molecule (Wakeland and Darby 1983; Wakeland et al.
1985). Studies at the DNA level confirm the relatedness
of these alleles, and have added more insight into the
mechanisms which play a role in the generation of
diversity of class II molecules (McConnell et al. 1986,
1988).

Wild  derived  lines  have  continually  shown  evidence 
for  recombination  within  the  I  region  (Duncan  and  Klein 
1980;  Wakeland  and  Klein  1979b;  Wakeland  and  Darby  1983; 
Singh et al. 1981). A recent study by Soper and
co-workers (1988) reveals the prevalence of the RHS in Eβ in
the diversification of I region haplotypes within a small
panel of wild mice. Singh et al. (1981) demonstrated that
recombination commonly occurs between Aα and Eβ, a finding
which had not been borne out at the molecular level until
the work in this dissertation. These data suggest the
important  role  of  the  wild  mouse  in  the  understanding  of 
the  evolution  of  the  I  region  and  its  application  to  the 
study  of  recombination. 


CHAPTER  III 
MATERIALS  AND  METHODS 


Mice 


All  mice  used  in  this  study  were  from  the  mouse 
colony  in  the  Tumor  Biology  Unit  at  the  Department  of 
Pathology  and  Laboratory  Medicine,  University  of  Florida, 
or  from  our  wild  mouse  colony  located  at  the  Animal  Care 
Facility,  University  of  Florida.   Strains  included  in  this 
analysis  are  listed  in  Table  3-1.   The  wild  derived  mouse 
strains  were  maintained  by  brother  x  sister  matings  and 
are  homozygous  at  the  H-2  complex  unless  otherwise  noted. 
t  haplotype  mice  were  supplied  by  Dr.  Joseph  Nadeau, 
Jackson  Laboratories,  and  were  maintained  as  heterozygotes 
due  to  the  lethality  of  the  t  mutations  carried  by  these 
strains. 

Antibody  Isolation  and  Conjugation 

Monoclonal antibody 14.4.4 (Ozato et al. 1980) was
produced by injecting 0.5 ml containing 2 x 10^ hybridoma
cells intraperitoneally into sub-lethally irradiated,
Pristane (Sigma, St. Louis, MO) primed, male BALB/c mice.
Ascites fluid was harvested every 2 days for a 1 month
period. The ascites fluid was run over a protein
A-sepharose column (Pharmacia Fine Chemicals, Uppsala) at 50
ml/hr, and the column was then washed with PBS until the
absorbance at 280 nm was at baseline as determined
spectrophotometrically.
IgG was eluted from the column with a solution of 0.58%
acetic acid, 0.15 M NaCl and the eluate collected in 5 ml
fractions. The eluted IgG was dialyzed 18 hours at 4°C
against pH 9.3 carbonate/bicarbonate buffer (17.3 g
NaHCO3/8.6 g Na2CO3 in 1 liter H2O). Fluorescein
isothiocyanate  (FITC)  (Sigma,  St.  Louis,  MO)  was  dissolved 
in  dimethyl  sulfoxide  at  a  concentration  of  1.0  mg/ml  and 
the  appropriate  amount  was  added  to  the  IgG  and  allowed  to 
react  at  room  temperature  for  2  hours.   This  mixture  was 
loaded  on  a  G-75  (Bio-Rad  Laboratories,  Richmond,  CA) 
column  in  PBS  plus  0.1%  NaN3  and  the  first  colored  band 
was  collected  and  used  in  subsequent  experiments. 


Spleen Cell Isolation, Immunostaining, and
Flow Cytometric Analysis


Freshly  explanted  spleens  were  minced  through  wire 
screens  to  make  single  cell  suspensions.   Red  blood  cells 
were  lysed  with  an  ammonium  sulfate  solution  (0.5%  w/v) 
and  washed  extensively  with  PBS.   1  x  10^  cells  were 
resuspended  in  PBS  plus  0.1%  NaN3  and  incubated  with  a 
1:150  dilution  of  the  FITC  conjugated  antibody  for  30 
minutes  at  4°C.   Samples  were  washed  3  times  with  PBS  and 
brought  up  in  0.5  ml  for  flow  cytometry.   Cells  were 
passed through a 44 μ nylon mesh filter and then run on a
FACS II fluorescence activated cell sorter
(Becton-Dickinson, Mountain View, CA) at a flow rate of
200-250 cells/second.

TABLE 3-1. H-2 Homozygous Wild and Inbred Mice.

Strain of
Mus m. domesticus    H-2      Geographic Origin

B10                  b        old inbred
B10.D2               d        old inbred
B10.M                f        old inbred
B10.WB               j        old inbred
B10.BR               k        old inbred
B10.F                p        old inbred
B10.Q                q        old inbred
B10.RIII             r        old inbred
B10.S                s        old inbred
B10.PL               u        old inbred
B10.SM               v        old inbred
B10.SAA48            w3       Michigan
B10.KEA5             w5       Michigan
B10.CAA2             w11      California
B10.STC77            w14      Michigan
B10.STC90            w15      Michigan
B10.CHA2             w26      Michigan
STU                  w34      West Germany
AZROU1               w201     Morocco
FAIYUM3              w206     Egypt
FAIYUM4              w207     Egypt
FAIYUM5              w208     Egypt
JERUSALEM3           w215     Israel
JERUSALEM4           w216     Israel
METKOVIC1            w217     Yugoslavia
METKOVIC2            w218     Yugoslavia
METKOVIC3            w219     Yugoslavia
W12A                 w233     Amsterdam
TT6                  t6
C3H.tw5              tw5
C3H.tw8              tw8
C3H.tw12             tw12
C3H.tw32             tw32
C3H.tw71             tw71
C3H.tw75             tw75

Isolation  of  genomic  DNA 

Genomic DNA was isolated from liver tissue by a
proteinase K (Sigma, St. Louis, MO)/SDS method as detailed in
Maniatis et al. (1982). After 24 hrs of starvation, mice
were sacrificed and their livers were surgically removed.
Liver tissue was minced with scissors, added to a mortar
containing liquid nitrogen, and ground to a fine powder. The
frozen powder was then added to 40 ml of TES buffer (10 mM
Tris HCl, pH 7.5; 5 mM EDTA; 100 mM NaCl) containing 1% SDS
and 0.4 mg/ml proteinase K. This solution was incubated at
65°C in a water bath for 18 hours. DNA solutions were then
extracted three times with Tris equilibrated phenol (pH
7.5) and twice with a chloroform/isoamyl alcohol solution
(25:1 v/v), and precipitated with 2.5 times the volume of
isopropanol. DNA was hooked out of solution with Pasteur
pipettes, resuspended in 1.0 ml TE buffer (10 mM Tris HCl,
pH 7.5; 1 mM EDTA), and dialyzed against TE buffer.
Resulting DNA solutions were quantitated
spectrophotometrically and electrophoresed on 0.7% agarose
gels to confirm their high molecular weight.

Endonuclease Digestion and
Agarose Gel Electrophoresis


Aqueous solutions containing 20 μg of high molecular
weight DNA were digested with 40 units of one of five
restriction endonucleases (Bam HI, Bgl II, Eco RI, Pvu II,
or Sac I) for 18 hours at 37°C under conditions described by
the supplier (Bethesda Research Laboratories, Bethesda, MD).
Complete digestion was confirmed by removing 10% of the
reaction mixture, adding 0.5 μg of phage lambda DNA,
incubating an additional 4 hours, and electrophoresing on an
agarose gel. Characteristic restriction patterns of phage
lambda DNA against a homogeneous smear of genomic DNA were
indicative of complete digestion of the DNA mixture. The
remaining digested genomic DNA solution was electrophoresed
on 0.7% agarose gels for 20 hours at 3 V/cm in a water
cooled horizontal electrophoresis apparatus (International
Biotechnologies Incorporated, New Haven, CT).

Capillary  DNA  Transfer  and  Hybridization 

Restriction endonuclease digested DNA was transferred
from the gels onto nylon membranes (Zetabind, AMF, Meriden,
CT) by the method of Southern (1980). The nylon membranes
were dried in a vacuum oven at 80°C for 3 hours and stored
at room temperature until hybridization. Membranes were
washed in a 0.1X SSC (0.015 M NaCl, 0.0015 M sodium citrate)
solution containing 0.5% SDS at 65°C for 1 hour in a shaking
water bath. Prehybridization and hybridization of the
filters were performed as described by the supplier (AMF,
Meriden, CT). Membranes were hybridized for 18 hours at 42°C
with 32P-labeled DNA probes prepared by nick translation
(Bethesda Research Laboratories, Bethesda, MD) to a specific
activity of approximately 2 x 10^ dpm/μg. Nonspecifically
bound probe was removed by two successive washes in 0.1X
SSC/0.1% SDS at 65°C in a shaking water bath. Membranes were
exposed for 2-5 days to XAR-5 X-ray film (Kodak, Rochester,
NY) with Lightning Plus intensifying screens (Dupont,
Wilmington, DE). After autoradiography, hybridized probes
were removed by washing with 0.1X SSC/0.5% SDS at 80°C for
20 minutes, and the membranes were sequentially rehybridized
with other labeled probes.

RNA  Isolation  and  Analysis 

Total cellular RNA was isolated from mouse splenocytes
with guanidine isothiocyanate (International Biotechnologies
Inc., New Haven, CT) as described (Chirgwin et al. 1979;
Dingier et al. 1986). Isolated spleen cells were dissolved
in 4 M guanidine isothiocyanate, layered onto a buffer of
5.7 M cesium chloride (Bethesda Research Laboratories,
Bethesda, MD), and centrifuged at 20,000 rpm for 18 hours.
The supernatant was aspirated and the precipitated RNA was
resuspended in sterile DEPC-treated water, phenol and
chloroform extracted twice, and ethanol precipitated. RNA
solutions were stored as precipitates in 100% ethanol at
-20°C. The RNA was quantitated spectrophotometrically and
10 μg were electrophoresed in 1.0% formaldehyde/agarose gels
to check for degradation. RNAs that were intact were
blotted to nylon membranes and hybridized with specific
probes as described in the above section. These filters
were probed with a 700 bp cDNA for the Eα gene.

Probes 

Seven single copy DNA probes, evenly spaced across the I
region and flanking the known recombinational hotspots, were
isolated from I region cosmid clones supplied by Dr. Michael
Steinmetz, except where otherwise noted. Figure 3-1 shows
the location of these probes with relation to the genes
within the I region. Probe 1 is a 5.8 kb Eco RI fragment
containing the entire Aβ gene (Malissen et al. 1983), probe
2 is a 1.2 kb Hind III fragment containing the α1 and α2
exons of Aα (J. Seidman, personal communication 1984), probe
3 is a 4.2 kb Eco RI/Xho I fragment midway between Aα and
Eβ, probe 4 is a 1.8 kb Eco RI fragment containing the β1
exon of Eβ, probe 5 is a 700 bp cDNA containing the β2, TM,
CTY, and 3'UT portions of the Eβ gene, probe 6 is a 4.5 kb
Bam HI fragment containing the Eβ2 pseudogene, and probe 7
is a 6.5 kb Bam HI fragment containing the 5' promoter
region of the Eα gene.

Genomic Restriction Mapping

Genomic  restriction  maps  were  determined  for  the 
pertinent  strains  by  first  determining  the  restriction  map 
of  the  different  I  region  cosmids  with  the  5  restriction 
endonucleases used in this study. Strains likely to show
recombination were double digested with the restriction
enzymes  and  fragment  sizes  determined  after  blotting  and 
hybridization  with  a  probe  for  the  locus  of  interest.   These 
fragments  were  then  used  to  map  the  restriction  endonuclease 
sites  for  each  strain. 

Data  Analysis 

A restriction fragment length polymorphism (RFLP)
analysis was performed on the data using equation 21 of Nei
and Li (1979):

F = 2n_xy / (n_x + n_y)

in which n_x and n_y are the numbers of fragments from
alleles x and y, respectively, and n_xy is the number of
fragments shared between the alleles. A pairwise correlation
analysis of the F values obtained in the RFLP analyses was
performed for adjacent loci using a computer program from
SAS (Statistical Analysis Software) where a cumulative
correlation coefficient (R) was calculated for each allele
as compared to all other alleles sampled.
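
The F statistic defined above can be sketched in a few lines of Python. This is a minimal illustration and not the program used in this study: the function name `fraction_shared`, the dictionary layout (enzyme name mapped to a list of fragment sizes in kb), and the two example alleles are all assumptions made for the sketch. It further assumes that two fragments count as shared only when they have the same size under the same enzyme.

```python
from collections import Counter


def fraction_shared(allele_x, allele_y):
    """Nei and Li (1979) fraction of shared fragments, F = 2*n_xy / (n_x + n_y).

    Each allele is a dict mapping an enzyme name to a list of fragment
    sizes (kb). Fragments are matched per enzyme (assumption of this sketch).
    """
    n_x = sum(len(frags) for frags in allele_x.values())
    n_y = sum(len(frags) for frags in allele_y.values())
    n_xy = 0
    for enzyme in allele_x.keys() & allele_y.keys():
        # Multiset intersection: a size seen twice can be shared at most twice.
        common = Counter(allele_x[enzyme]) & Counter(allele_y[enzyme])
        n_xy += sum(common.values())
    return 2 * n_xy / (n_x + n_y)


# Hypothetical alleles: three fragments each, two of which are shared.
example_x = {"Bam HI": [5.0], "Eco RI": [9.0, 2.0]}
example_y = {"Bam HI": [5.0], "Eco RI": [8.0, 2.0]}
print(round(fraction_shared(example_x, example_y), 2))
```

For the two hypothetical alleles, n_x = n_y = 3 and n_xy = 2, so F = 4/6.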


CHAPTER  IV 
RESULTS 


RFLP  Analysis  of  the  I  Region 

An RFLP analysis of the Aβ gene was done previously
(McConnell et al. 1986, 1988). Briefly, a 5.5 kb Aβ
fragment, probe 1, identified twenty-one alleles with an
average F value of 0.29 within this panel of mice. This
high degree of divergence (low F value) was found to be due
to a retroposon insertion within the second intron. This
insertion separated the alleles into three evolutionary
groups based on either the absence of a retroposon, the
presence of an 851 bp retroposon, or the presence of a 1.1
kb retroposon insertion. When these retroposon polymorphisms
were taken into account during the RFLP analysis, the mean F
value rose to 0.64.

Probe 2, a 1.1 kb Aα fragment containing the α1 and α2
exons, shows a high degree of polymorphism similar to that
seen with Aβ, and a high number of alleles (Table 4-1). These
results show that the degree of diversity detected by each
restriction enzyme may vary. By using a combination of
several enzymes, the relative diversity between alleles can
be better estimated. The data in Table 4-1 also show the
frequency of each allele within this panel of mice. For
example, seven strains carry the a allele at this locus
whereas only one strain carries the g allele. These data
also show the degree of relatedness between particular
alleles, some of which are minor variants of each other
that differ by only a single restriction fragment; for
example, allele b and allele g differ by a single Bam HI
fragment. The F values calculated between alleles are shown
in Table 4-2. There are six fragments for each allele for
the five enzymes, with each pair of alleles sharing from two
to ten of the twelve fragments counted for the pair. The
result of the RFLP analysis is depicted in Figure 4-1, where
probe 2 identifies twelve alleles with a mean F value of
0.49 ± 0.18. All polymorphisms at this locus are due to
restriction enzyme site changes rather than to insertions or
deletions, as determined by restriction fragment length
comparisons.
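
As a worked illustration of the pairwise comparisons described above, the following Python sketch computes F for four of the Aα alleles (a, b, g, and j, with fragment sizes as listed in Table 4-1) and averages the pairwise values. The names `ALLELES`, `f_value`, and `pairwise` are hypothetical, same-enzyme size matching is assumed, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, stdev

# Fragment sizes (kb) in the order Bam HI, Bgl II, Eco RI, Pvu II, Sac I,
# as listed for four alleles in Table 4-1.
ALLELES = {
    "a": ([5.2], [3.5], [12.0], [7.0], [13.7, 6.6]),
    "b": ([5.4], [4.8], [12.0], [7.0], [11.1, 6.6]),
    "g": ([5.2], [4.8], [12.0], [7.0], [11.1, 6.6]),
    "j": ([5.2], [6.5], [12.0], [7.0], [13.7, 6.6]),
}


def f_value(x, y):
    """F = 2*n_xy / (n_x + n_y), matching fragments enzyme by enzyme."""
    n_x = sum(len(frags) for frags in x)
    n_y = sum(len(frags) for frags in y)
    n_xy = sum(sum((Counter(fx) & Counter(fy)).values()) for fx, fy in zip(x, y))
    return 2 * n_xy / (n_x + n_y)


# F for every unordered pair of alleles.
pairwise = {
    (p, q): f_value(ALLELES[p], ALLELES[q])
    for p, q in combinations(sorted(ALLELES), 2)
}

for (p, q), f in pairwise.items():
    print(f"F({p},{q}) = {f:.2f}")
print(f"mean F = {mean(pairwise.values()):.2f} +/- {stdev(pairwise.values()):.2f}")
```

In this subset, alleles b and g share five of their six fragments (F = 10/12), which corresponds to the single Bam HI difference noted in the text.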

Probe 3, a 4.2 kb fragment midway between Aα and Eβ and
designated intergenic sequence 1 (I1), identified thirteen
alleles (Table 4-3) in which nineteen of the twenty-eight
mouse strains can be grouped into the first four allelic
lineages. This probe detects six to seven fragments per
allele as seen in Table 4-4, with these alleles showing an
average F value of 0.47 ± 0.20 (Figure 4-1). There are no
discernible insertion or deletion polymorphisms when the
restriction fragment lengths are compared at this locus.

The probe for the 5' portion of the Eβ gene, probe 4,
identifies  nine  alleles  (Table  4-5)  in  which  there  are  only 
three  allelic  groups  without  multiple  members,  and  one 



Table 4-1. RFLP sizes and allelic grouping of strains for Aα.

allele^a  Bam HI^b  Bgl II  Eco RI  Pvu II  Sac I      strains

a         5.2       3.5     12.0    7.0     13.7, 6.6  B10.F, B10.Q, B10.KEA5, MET2,
                                                       B10.SAA48, B10.CAA2
b         5.4       4.8     12.0    7.0     11.1, 6.6  W12A, STU, FAI4, MET3, FAI5
c         5.4       5.2     12.0    7.0     8.0, 6.0   B10, JER3, AZR1, B10.STC90
d         5.2       4.8     12.0    7.0     8.0, 4.5   B10.WB, JER4
e         5.4       4.8     12.0    4.0     13.7, 6.6  MET1, B10.RIII
f         5.4       4.8     12.0    6.0     11.1, 4.5  B10.BR, B10.CHA2
g         5.2       4.8     12.0    7.0     11.1, 6.6  B10.S
h         5.4       4.8     12.0    7.0     13.7, 6.6  B10.SM
i         5.2       5.2     12.0    4.0     10.5, 6.0  FAI3
j         5.2       6.5     12.0    7.0     13.7, 6.6  B10.PL
k         5.4       3.5     12.0    7.0     8.7, 6.6   B10.D2
l         5.2       4.8     12.0    7.0     8.0, 6.0   B10.M

^a Allele designations in ascending alphabetical order based on frequency.
^b Values are expressed in kilobases.

Table 4-2. RFLP analysis of Aα alleles.

allele  a      b      c      d      e      f      g      h      i      j      k      l

a       -      6/12   4/12   6/12   6/12   2/12   8/12   8/12   4/12   10/12  8/12   6/12
b       .50    -      6/12   6/12   8/12   8/12   10/12  10/12  2/12   6/12   8/12   6/12
c       .33    .50    -      6/12   4/12   4/12   4/12   6/12   6/12   4/12   6/12   8/12
d       .50    .50    .50    -      4/12   6/12   8/12   6/12   4/12   6/12   4/12   10/12
e       .50    .67    .33    .33    -      6/12   6/12   10/12  4/12   6/12   6/12   4/12
f       .17    .67    .33    .50    .50    -      6/12   6/12   2/12   2/12   4/12   4/12
g       .67    .83    .33    .67    .50    .50    -      8/12   4/12   8/12   6/12   8/12
h       .67    .83    .50    .50    .83    .50    .67    -      2/12   8/12   8/12   6/12
i       .33    .17    .50    .33    .33    .17    .33    .17    -      4/12   2/12   6/12
j       .83    .50    .33    .50    .50    .17    .67    .67    .33    -      6/12   6/12
k       .67    .67    .50    .33    .50    .33    .50    .67    .17    .50    -      4/12
l       .50    .50    .67    .83    .33    .33    .67    .50    .50    .50    .33    -

^a (lower left) The fraction homologous (F) value as defined by Nei and Li (1979).
^b (upper right) Number of shared RF / total RF for both alleles, based on restriction
digestion with Bam HI, Bgl II, Eco RI, Pvu II, and Sac I.

Table 4-3. RFLP sizes and allelic grouping of strains for I1.

allele^a  Bam HI^b  Bgl II    Eco RI    Pvu II    Sac I       strains

a       10.0      5.0       9.0       2.0       9.0       B10,  B10.SM,

2.0       2.0  B10.S,  AZR1,

JER3,  MET1,
B10.RIII,  MET2,
B10.STC90,  FAI4

b       10.0      5.0       6.0       2.0       9.0       B10.Q,  B10.CAA2

B10.STC77
B10.CAA2

c       10.0      6.0       8.5       6.0       9.0       W12A,  STU,  FAI5
2.0       2.0

d       7.0       2.0       2.0       2.0       5.0       B10.BR,  B10.CHA2

0.5       1.5

e       9.0       1.5       8.0       2.0       9.0       MET3

2.0

f       6.0       6.0       6.0       2.0       8.0       JER4
1.0       1.0

g       10.0      6.0       10.0      6.0       9.0       B10.D2


5.0 

6.0 

2.0 

2.0 

6.0 

8.5 

2.0 

2.0 

2.0 

2.0 

0.5 

1.5 

1.5 

8.0 

2.0 

6.0 

6.0 

1.0 

1.0 

6.0 

10.0 

2.0 

5.0 

8.5 

2.0 

2.0 

6.0 

10.0 

2.0 

2.0 

5.0 

10.0 

2.0 

2.0 

6.0 

9.0 

2.0 

2.0 

1.5 

10.0 

2.0 

1.5 

9.0 

2.0 

h       7.0       5.0  8.5       2.0       9.0       B10.WB

2.0  2.0

i       10.0      6.0  10.0      4.0       9.0       B10.M

2.0  2.0

j       10.0      5.0  10.0      2.0       9.0       B10.SAA48

2.0  2.0

k       11.0      6.0  9.0       2.0       9.0       FAI3

2.0  2.0

l       10.0      1.5  10.0      2.0       9.0       B10.F

2.0

m       7.0       1.5  9.0       2.0       9.0       B10.PL


^a Allele designations in ascending alphabetical order based on frequency.
^b Values are expressed in kilobases.

Table 4-4. RFLP analysis of I1 alleles.

allele  a      b      c      d      e      f      g      h      i      j      k      l      m

a       -      12/14  8/14   6/14   6/13   2/14   6/13   10/14  8/14   12/14  10/14  8/13   6/13
b       .86    -      8/14   6/14   6/13   4/14   6/13   10/14  8/14   12/14  8/14   8/13   6/13
c       .57    .57    -      4/14   4/13   2/14   10/13  6/14   10/14  8/14   8/14   6/13   4/13
d       .43    .43    .29    -      4/13   2/14   2/13   8/14   4/14   6/14   6/14   4/13   6/13
e       .46    .46    .31    .31    -      2/13   2/12   6/13   4/13   6/13   6/13   8/12   8/12
f       .14    .29    .14    .14    .15    -      2/13   2/14   2/14   2/14   4/14   2/13   2/13
g       .46    .46    .77    .15    .17    .15    -      4/13   10/13  8/13   6/13   6/12   2/12
h       .71    .71    .43    .57    .46    .14    .31    -      6/14   10/14  8/14   6/13   8/13
i       .57    .57    .71    .29    .31    .14    .77    .43    -      10/14  8/14   8/13   4/13
j       .86    .86    .57    .43    .46    .14    .62    .71    .71    -      8/14   10/13  6/13
k       .71    .57    .57    .43    .46    .29    .46    .57    .57    .57    -      6/13   8/13
l       .62    .62    .46    .31    .67    .15    .50    .46    .62    .77    .46    -      8/13
m       .46    .46    .31    .46    .67    .15    .17    .62    .31    .46    .62    .62    -

^a (lower left) The fraction homologous (F) value as defined by Nei and Li (1979).
^b (upper right) Number of shared RF / total RF for both alleles, based on restriction
digestion with Bam HI, Bgl II, Eco RI, Pvu II, and Sac I.

Table 4-5. RFLP sizes and allelic grouping of strains for 5'Eβ.

allele^a  Bam HI^b  Bgl II  Eco RI  Pvu II  Sac I   strains

a         8.6       3.0     2.0     3.4     5.2     B10, B10.S, B10.SM, JER3,
                                                    B10.STC90, MET1, MET2, FAI4
b         3.8       2.0     4.1     3.4     2.5     B10.F, B10.CAA2, B10.KEA5,
                                                    B10.Q, B10.STC77
c         4.8       2.7     2.0     3.0     5.0     B10.D2, W12A, STU, FAI5
d         6.9       3.0     2.0     3.4     5.2     B10.BR, AZR1, B10.CHA2
e         3.8       3.0     2.0     3.6     4.2     JER4, B10.WB
f         3.8       3.0     1.0     3.4     2.5     B10.SAA48, FAI3
g         3.8       2.5     2.0     3.4     2.6     MET3
h         3.8       -       17.1    -       -       B10.M
i         8.6       3.0     2.0     3.6     5.2     B10.PL

^a Allele designations in ascending alphabetical order based on frequency.
^b Values are expressed in kilobases.

Table 4-6. RFLP analysis of 5'Eβ alleles.

allele  a      b      c      d      e      f      g      h      i

a       -      2/10   2/10   8/10   4/10   4/10   4/10   0/10   8/10
b       .20    -      0/10   2/10   2/10   6/10   4/10   2/10   0/10
c       .20    0      -      2/10   2/10   0/10   2/10   0/10   2/10
d       .80    .20    .20    -      4/10   4/10   4/10   0/10   6/10
e       .40    .20    .20    .40    -      4/10   4/10   2/10   6/10
f       .40    .60    0      .40    .40    -      4/10   2/10   2/10
g       .40    .40    .20    .40    .40    .40    -      2/10   2/10
h       0      .20    0      0      .20    .20    .20    -      0/10
i       .80    0      .20    .60    .60    .20    .20    0      -

^a (lower left) The fraction homologous (F) value as defined by Nei
and Li (1979).
^b (upper right) Number of shared RF / total RF for both alleles, based
on restriction digestions with Bam HI, Bgl II, Eco RI,

Pvu II, and Sac I.

lineage containing thirteen mice of the H-2p haplotype. The
H-2p-like alleles show a 1.0 kb deletion when compared to
the other alleles in this panel. This deletion was analyzed
previously by an RFLP analysis, and was shown to have no
effect on the transcription or expression of this gene in
the H-2p haplotype (Soper et al. 1988). Because eight
strains share this polymorphism, the average F value for
this locus is extremely low, 0.28 ± 0.22 (Figure 4-1 and
Table 4-6). There is also one other small 100 to 200 bp
deletion detected in the c allele when compared to the a
allele for this probe. This deletion also shows no effect
on the expression of the gene. When these deletions are
taken into account, the average F value rises to 0.60. This
locus, and the three previous loci, all map into what was
described as the variable tract of the I region (Steinmetz
et al. 1984) as characterized by a large number of very
distinct alleles.

Probe 5, a 700 bp cDNA for the 3' portion of the Eβ
gene, defines fifteen alleles which are more closely related
to one another than alleles detected at previous loci
(Tables 4-7 and 4-8). Fourteen strains comprise the four
most frequent alleles, and ten of the remaining mice
represent minor variants differing by a single restriction
fragment (RF). Therefore, twenty-four of the twenty-eight
strains can be categorized into these four most common
alleles, which share approximately 60% of their restriction
fragments. This homogeneity accounts for the high overall
mean F value of 0.57 ± 0.19 (Figure 4-1).

The Eβ2 pseudogene, probe 6, shows seven alleles in
which only one of these alleles differs by more than a
single RF, and this allele is shared by only three strains
(Table 4-9). Table 4-10 demonstrates the relatedness of
these alleles as reflected in the extremely high F values
between mice. The mean F value for this locus is 0.80 ±
0.08 (Figure 4-1).

The Eα gene, characterized by probe 7, is another very
non-polymorphic locus. The Eα gene shows only four
alleles which can be grouped into two allelic lineages.
Alleles b and c are minor variants of the a allele and
together form one major evolutionary lineage, while allele
d-like mice make up the second predominant lineage (Table
4-11). Allele a and its variants represent a closely
related family of alleles that is distinct from the d
lineage, as shown by the F values generated in an RFLP
analysis (Table 4-12). In total, these data show that the
Eα alleles can be separated into two classes which correlate
with the presence of a 650 bp deletion in the centromeric
portion of the gene. The alleles of the a lineage (a, b,
and c) do not carry the deletion and, therefore, can
transcribe and express this gene. Allele d carrying mice do
not have mRNA transcribed and do not express an I-E molecule
on their cell surface (Table 4-13).
The nature of the defects in E molecule expression within
these mice was examined at both the DNA and the RNA level by

Table 4-7. RFLP sizes and allelic grouping of strains for 3'Eβ.

allele^a  Bam HI^b  Bgl II  Eco RI  Pvu II    Sac I     strains

a         10.0      4.0     12.0    4.5, 3.0  2.6, 2.4  B10.F, B10.CAA2, B10.STC77,
                                                        JER4, B10.KEA5
b         10.0      3.1     12.2    4.5, 3.0  6.4, 2.6  B10.D2, B10.S, FAI5, W12A, FAI5
c         10.0      2.8     12.2    4.5, 3.0  6.4, 2.5  B10.BR, MET2, FAI4
d         10.0      2.8     6.0     4.5, 3.0  6.4, 2.5  B10.CHA2, JER3
e         10.0      2.8     12.2    4.5, 3.0  2.6, 2.4  B10.M, MET1
f         10.0      4.0     12.0    4.5, 3.0  6.4, 2.6  FAI3, B10.WB
g         10.0      3.1     6.0     4.5, 2.7  6.4, 2.5  B10.RIII
h         10.0      3.1     6.0     4.5, 3.0  6.4, 2.5  B10.SM
i         10.0      4.6     12.2    4.5, 2.7  6.4, 2.6  B10.PL
j         10.0      2.8     12.0    4.5, 3.0  2.6, 2.4  B10.Q
k         8.6       4.6     12.2    5.0, 4.6  2.6, 2.4  B10.SAA48
l         10.0      3.1     12.2    4.5, 3.0  2.6, 2.4  AZR1
m         10.0      2.8     6.0     4.5, 2.7  6.4, 2.6  B10.STC90
n         10.0      3.1     6.0     4.5, 2.7  6.4, 2.6  B10
o         10.0      4.6     5.0     4.5, 3.0  6.4, 2.6  MET3

^a Allele designations in ascending alphabetical order based on frequency.
^b Values are expressed in kilobases.

Table 4-8. RFLP analysis of 3'Eβ alleles.

allele  a      b      c      d      e      f      g      h      i      j      k      l      m      n      o

a       -      8/14   6/14   6/14   10/14  12/14  4/14   6/14   6/14   12/14  4/14   10/14  6/14   6/14   8/14
b       .57    -      10/14  8/14   10/14  10/14  4/14   6/14   6/14   8/14   4/14   12/14  8/14   10/14  10/14
c       .43    .71    -      12/14  10/14  8/14   8/14   10/14  8/14   8/14   2/14   8/14   8/14   6/14   8/14
d       .43    .57    .86    -      8/14   8/14   10/14  12/14  6/14   8/14   2/14   6/14   10/14  8/14   8/14
e       .71    .71    .71    .57    -      8/14   4/14   6/14   8/14   12/14  6/14   12/14  8/14   6/14   8/14
f       .86    .71    .57    .57    .57    -      6/14   8/14   8/14   10/14  2/14   8/14   8/14   8/14   10/14
g       .29    .29    .57    .71    .29    .43    -      12/14  8/14   4/14   0/14   6/14   10/14  12/14  6/14
h       .43    .43    .71    .86    .43    .57    .86    -      6/14   6/14   0/14   8/14   8/14   10/14  6/14
i       .43    .43    .57    .43    .57    .57    .57    .43    -      6/14   6/14   8/14   10/14  10/14  10/14
j       .86    .57    .57    .57    .86    .71    .29    .43    .43    -      4/14   10/14  8/14   6/14   8/14
k       .29    .29    .14    .14    .43    .14    0      0      .43    .29    -      6/14   2/14   2/14   4/14
l       .71    .86    .57    .43    .86    .57    .43    .57    .57    .71    .43    -      6/14   8/14   8/14
m       .43    .57    .57    .71    .57    .57    .71    .57    .71    .57    .14    .43    -      12/14  8/14
n       .43    .71    .43    .57    .43    .57    .86    .71    .71    .43    .14    .57    .86    -      8/14
o       .57    .71    .57    .57    .57    .71    .43    .43    .71    .57    .29    .57    .57    .57    -

^a (lower left) The fraction homologous (F) value as defined by Nei and Li (1979).
^b (upper right) Number of shared RF / total RF for both alleles, based on restriction digestion
with Bam HI, Bgl II, Eco RI, Pvu II, and Sac I.

Table 4-9. RFLP sizes and allelic grouping of strains for Eβ2.


allele 

Bam  HI 

Bgl  II 

Eco  RI 

Pvu  II 

Sac  I 

strains 

a 

b 

a 

4.5 

4.0 

3.0 

4.0 

10.0 

B10.D2,  W12A, 

1.0 

2.5 

3.0 

1.0 

B10.CHA2, B10.S,

0.5 

B10.Q, B10.CAA2,
B10.STC77, B10.M
B10.PL, STU,
JER4, AZR1, FAI3
FAI5, B10.SM,
B10.F, B10.KEA5

b 

4.5 

4.0 

3.0 

3.5 

10.0 

B10.RIII, MET3,

1.0 

2.5 

3.0 

1.0 

B10.STC90 

0.5 

c 

4.5 

4.5 

3.0 

4.0 

10.0 

MET2, FAI4,

1.0 

2.5 

0.5 

3.8 

1.0 

B10.WB

d 

4.5 

4.0 

3.0 

4.0 

10.0 

JER3, MET1

1.0 

2.5 

3.8 

1.0 

0.5 

e 

4.5 

4.2 

3.0 

3.5 

10.0 

B10

1.0 

2.5 

3.0 

1.0 

0.5 

f 

4.5 

4.0 

3.0 

4.0 

9.0 

B10.SAA48 

1.0 

2.5 

3.0 

1.0 

0.5 

g 

4.5 

4.0 

3.0 

5.5 

10.0 

B10.BR

1.0 

2.5 

4.0 

1.0 

0.5 

^Allele  designations  in  ascending  alphabetical  order  based  on  frequency. 
^b Values are expressed in kilobases.

Table 4-10. RFLP analysis of Eβ2 alleles.

allele  a      b      c      d      e      f      g

a       -      18/20  18/20  16/20  16/20  18/20  18/20
b       .90    -      14/20  16/20  16/20  16/20  16/20
c       .90    .70    -      18/20  14/20  14/20  16/20
d       .80    .80    .90    -      14/20  16/20  18/20
e       .80    .80    .70    .70    -      14/20  14/20
f       .90    .80    .70    .80    .70    -      16/20
g       .90    .80    .80    .90    .70    .80    -

^a (lower left) The fraction homologous (F) value as defined
by Nei and Li (1979).

^b (upper right) Number of shared RF / total RF for both
alleles, based on restriction digestion
with Bam HI, Bgl II, Eco RI, Pvu II, and Sac I.

Table 4-11. RFLP sizes and allelic grouping of strains for Eα.

allele^a  Bam HI^b  Bgl II  Eco RI  Pvu II  Sac I               strains

a         6.5       6.0     8.5     6.0     6.0, 2.5, 1.2       B10.D2, B10.SM, B10.BR, B10.PL,
                                                                B10.STC90, MET3, B10.CAA2, JER4,
                                                                B10.SAA48, FAI5, B10.RIII, B10.Q,
                                                                B10.M, B10.KEA5
b         6.5       6.0     8.5     6.0     6.0, 1.2, 0.7, 0.4  W12A, STU, AZR1, JER3, B10.F
c         6.5       6.0     7.5     6.0     6.0, 2.5, 1.2       B10.STC77, FAI3, B10.WB, B10.CHA2
d         6.0       5.5     8.0     5.5     8.0, 2.5            B10, B10.S, MET2, FAI4

^a Allele designations in ascending alphabetical order based on frequency.
^b Values are expressed in kilobases.

Table 4-12. RFLP analysis of Eα alleles.

allele  a      b      c      d

a       -      12/15  12/14  2/13
b       .80    -      10/15  0/15
c       .86    .67    -      2/13
d       .15    0      .15    -

^a (lower left) The fraction homologous (F) value
as defined by Nei and Li (1979).

^b (upper right) Number of shared RF / total RF for
both alleles, based on restriction
digestion with Bam HI, Bgl II, Eco RI,
Pvu  II,   and  Sac  I. 
Table 4-13.  Characterization of Eα strains.


strain    haplotype

AZR1      w201
B10       b
B10.S     s
B10.M     f
B10.Q     q
FAI4      w207
MET2      w218

surface
expression


Eα message
expression


deletion
present


+ 
+ 


+ 
+ 


a Mice show aberrant sized message.

b Mice show low message levels due to a defect affecting the
stability of the message.
RFLP and northern blot analysis.  Of the three identified
defects in Eα gene expression (the 650 bp deletion, a
splicing defect, or a message stability defect), only the
deletion mutation is found in our panel of wild mice (Table
4-13).  This deletion is the only major polymorphism
seen at this locus and accounts for the low mean F value
(0.44 ± 0.38) seen between these mice (Table 4-12 and Figure
4-1).

Probes 5, 6, and 7 all map within the conserved tract
of the I region, which is characterized by a low number of
closely related alleles (Steinmetz et al. 1984).

Allele  Lineages 

When two or more mouse strains shared an F value of
>0.80, thus differing by a single RF, they were grouped into
evolutionary lineages.  This is demonstrated in Figure 4-2
where, for Aα, there are five strains which carry allele b
with two minor variants, bv1 and bv2.  Each variant allele
is represented by a single member which differs from b by a
unique Bam HI and Sac I fragment, respectively.  The average
F value between mice within a lineage is 0.83 whereas the
value is 0.50 between lineages.  The lineages for
each of the loci probed are shown in Table 4-14.  Grouping
the alleles at each locus across the I region lowers the
number of distinct alleles twofold.  This helps simplify
the data for further analysis and shows that there are a
limited number of old alleles carried by the mice in this
panel.
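
The F value and the lineage grouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fragment sizes are those read from Table 4-11 for the four Eα alleles, the assignment of the extra fragments to the Sac I column is an assumption, the cutoff is treated as F >= 0.80, and the single-linkage grouping rule and the function names are hypothetical rather than the procedure actually used in this study.

```python
from fractions import Fraction
from itertools import combinations


def f_value(frags_x, frags_y):
    """Fraction of shared fragments, F = 2*n_xy / (n_x + n_y) (Nei and Li 1979)."""
    shared = len(frags_x & frags_y)
    return Fraction(2 * shared, len(frags_x) + len(frags_y))


def group_lineages(alleles, threshold=Fraction(4, 5)):
    """Assumed rule: alleles linked by any pair with F >= threshold share a lineage."""
    names = sorted(alleles)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links up to the representative of the group.
        while parent[name] != name:
            name = parent[name]
        return name

    for x, y in combinations(names, 2):
        if f_value(alleles[x], alleles[y]) >= threshold:
            parent[find(y)] = find(x)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Each fragment is an (enzyme, size in kb) pair, read from Table 4-11.
# Placing the extra fragments under Sac I is an assumption.
EA_ALLELES = {
    "a": {("Bam HI", 6.5), ("Bgl II", 6.0), ("Eco RI", 8.5), ("Pvu II", 6.0),
          ("Sac I", 6.0), ("Sac I", 2.5), ("Sac I", 1.2)},
    "b": {("Bam HI", 6.5), ("Bgl II", 6.0), ("Eco RI", 8.5), ("Pvu II", 6.0),
          ("Sac I", 6.0), ("Sac I", 1.2), ("Sac I", 0.7), ("Sac I", 0.4)},
    "c": {("Bam HI", 6.5), ("Bgl II", 6.0), ("Eco RI", 7.5), ("Pvu II", 6.0),
          ("Sac I", 6.0), ("Sac I", 2.5), ("Sac I", 1.2)},
    "d": {("Bam HI", 6.0), ("Bgl II", 5.5), ("Eco RI", 8.0), ("Pvu II", 5.5),
          ("Sac I", 8.0), ("Sac I", 2.5)},
}

lineages = group_lineages(EA_ALLELES)
```

Under these assumptions the pairwise values agree with Table 4-12 (for example 12/15 for a versus b and 2/13 for a versus d), and alleles a, b, and c fall into one lineage while the deletion allele d stands alone.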


Evidence for Site Specific Recombination Within the I
Region:  Identification of RHSs in Eβ and Eα


The lineage designations determined in the preceding
section were used to determine if and where recombination
occurs within the I region haplotypes in our panel.  It was
expected that if recombination is occurring between two
loci, the pattern of allele associations of the recombinant
haplotypes will change with respect to that of the donor
haplotypes.  This is confirmed by the results shown in Table
4-15 when compared to the data presented in Table 4-16.  The
data presented in Table 4-16 demonstrate the linkage
disequilibrium between two loci, II and 5'Eβ, seen in the
absence of recombination.  The results in Table 4-15
demonstrate the predicted switching of allele associations
between two loci, 5'Eβ and 3'Eβ, known to undergo
recombination.  These associations are so pronounced in
haplotypes not undergoing recombination that by knowing the
lineage of one locus, the lineage of the adjacent locus can
be predicted accurately in 90% of the cases.  For example,
referring to Table 4-16, if a strain belongs to lineage a
for II, then in nine of ten cases these same mice are
lineage a for 5'Eβ.  This same pattern holds true for all
other lineages.  In contrast, especially for lineage a, this
Table 4-14.  Allele Lineages Across the I region.

strain       Aβ     Aα     II     5'Eβ   3'Eβ   Eβ2    Eα

B10          h      c      a      a      gv2    av3    d
B10.BR       p      f      d      d      d      av5    a
B10.CAA2     a      a      av1    b      a      a      a
B10.CHA2     p      f      d      d      dv1    a      av2
B10.D2       cv1    k      cv1    c      b      a      a
B10.F        a      a      av1    b      a      a      av1
B10.KEA5     a      a      av1    b      a      a      a
B10.M        l      dv1    i      h      cv1    a      a
B10.PL       q      av1    m      av1    i      a      a
B10.Q        a      a      av1    b      av1    a      a
B10.RIII     c      e      a      a      av2    av1    a
B10.S        m      bv1    a      a      b      a      d
B10.SAA48    bv1    a      av2    f      k      av4    a
B10.SM       d      bv2    a      a      av3    a      a
B10.STC77    a      a      av1    b      a      a      av2
B10.STC90    n      c      a      a      cv3    av1    a
B10.WB       i      d      h      e      av4    c      av2
AZR1         h      c      a      d      l      a      av1
FAI3         l      i      k      f      av4    a      av2
FAI4         jv1    b      a      a      d      c      d
FAI5         jv1    b      c      c      b      a      a
JER3         h      c      a      a      dv1    av2    av1
JER4         i      d      f      e      a      a      a
MET1         c      e      a      a      cv1    av2    av2
MET2         a      a      a      a      d      c      d
MET3         mv1    b      e      g      o      av1    a
STU          j      b      c      c      b      a      av1
W12A         j      b      c      c      b      a      av1
Table 4-15.  Lineage associations within Eβ.

strain       5'Eβ   3'Eβ   recombinant (a)

B10          a      gv2    -
B10.RIII     a      av2    +
B10.S        a      b      +
B10.SM       a      av3    +
B10.STC90    a      cv2    -
FAI4         a      d      +
JER3         a      dv1    +
MET1         a      cv1    -
MET2         a      d      +
B10.PL       av1    i      -
B10.F        b      a      -
B10.KEA5     b      a      -
B10.Q        b      av1    -
B10.STC77    b      a      -
B10.CAA2     b      a      -
B10.D2       c      b      -
FAI5         c      b      -
STU          c      b      -
W12A         c      b      -
B10.BR       d      d      -
B10.CHA2     d      dv1    -

a Recombinants defined by switching of lineage
association between loci.
Table 4-16.  Lineage associations centromeric of Eβ.

strain       II     5'Eβ   recombinant

B10          a      a      -
B10.RIII     a      a      -
B10.S        a      a      -
B10.SM       a      a      -
B10.STC90    a      a      -
AZR1         a      d      +
FAI4         a      a      -
JER3         a      a      -
MET1         a      a      -
MET2         a      a      -
B10.CAA2     av1    b      -
B10.F        av1    b      -
B10.KEA5     av1    b      -
B10.Q        av1    b      -
B10.STC77    av1    b      -
B10.SAA48    av2    f      -
FAI5         c      c      -
W12A         c      c      -
STU          c      c      -
B10.D2       cv1    c      -
B10.BR       d      d      -
B10.CHA2     d      d      -
predictive ability is lost due to the high degree of
recombination that occurs between 5' and 3'Eβ (Table 4-15).
All recombination events at this site occur between an a
allele at 5'Eβ and some other allele.  Six recombination
events are scored at this site as compared to only one
possible event between II and 5'Eβ in all the mouse
haplotypes tested.  This is therefore designated a
recombinational hotspot (RHS) based on the high frequency of
localized recombinational events specific for the a lineage.
These recombinational events can be represented graphically
(Figure 4-3A), where the fill pattern at each locus
represents the lineage origin of the genomic segment of
interest.  This diagram represents the identification of an
RHS within Eβ for mice with the w22, w26, w207, and s
haplotypes.
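
The "nine of ten" prediction between II and 5'Eβ can be checked with a small tally. The sketch below is an illustration only: the rows are the lineage pairs transcribed from Table 4-16 (with OCR-damaged strain names normalized), and the function name and the idea of reporting the most common partner lineage are assumptions made for the example, not part of the original analysis.

```python
from collections import Counter, defaultdict

# (strain, II lineage, 5'E-beta lineage), transcribed from Table 4-16.
TABLE_4_16 = [
    ("B10", "a", "a"), ("B10.RIII", "a", "a"), ("B10.S", "a", "a"),
    ("B10.SM", "a", "a"), ("B10.STC90", "a", "a"), ("AZR1", "a", "d"),
    ("FAI4", "a", "a"), ("JER3", "a", "a"), ("MET1", "a", "a"),
    ("MET2", "a", "a"), ("B10.CAA2", "av1", "b"), ("B10.F", "av1", "b"),
    ("B10.KEA5", "av1", "b"), ("B10.Q", "av1", "b"), ("B10.STC77", "av1", "b"),
    ("B10.SAA48", "av2", "f"), ("FAI5", "c", "c"), ("W12A", "c", "c"),
    ("STU", "c", "c"), ("B10.D2", "cv1", "c"), ("B10.BR", "d", "d"),
    ("B10.CHA2", "d", "d"),
]


def adjacent_predictability(rows):
    """For each lineage at the first locus, return the most common lineage at
    the second locus as (partner, strains with that partner, total strains)."""
    by_left = defaultdict(Counter)
    for _strain, left, right in rows:
        by_left[left][right] += 1
    summary = {}
    for left, counts in by_left.items():
        partner, hits = counts.most_common(1)[0]
        summary[left] = (partner, hits, sum(counts.values()))
    return summary


summary = adjacent_predictability(TABLE_4_16)
```

For lineage a at II the tally gives lineage a at 5'Eβ in nine of ten strains, the single exception being AZR1, which is the one possible recombination event scored between these loci.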

A similar analysis was performed on the genomic
segment containing the Eβ2 and Eα genes.  Because of the
lower degree of polymorphism at the telomeric end of the I
region, the 650 bp deletion in Eα is used as a marker in
order to identify recombination within this region.
Recombination is scored when a deleted allele of Eα is
observed adjacent to a proximal expressor associated allele
for Eβ2 (Figure 4-3B).  Of the three possible lineages of
Eβ2 and two lineages at Eα, three recombinant mouse
haplotypes can be identified: b, s, and ±.
Both of these RHSs correspond to the hotspots previously
identified in laboratory inbred strains (Steinmetz et al.
1982b; Lafuse et al. 1986).


Identification of a Recombinational Hotspot
Between Aα and Eβ


By the same methods and criteria as above, a third RHS
was identified between Aα and Eβ.  This recombinational
hotspot maps to a 4.7 to 9.2 kb stretch of DNA midway
between the two genes (Figure 4-4).  The minimum distance
was defined at the centromeric end by a Bgl II site in w207
and at the telomeric end by a Pvu II site.  These sites were
confirmed with a 2.8 kb Bam HI fragment probe which lies
equidistant between the Aα and II probes.  Five
recombinational events were scored at this site and the data
for some of the representative haplotypes (s, w207, and w218)
are presented in Figure 4-3C.  All recombinational events
occur between mice of the a lineage for II and, usually,
strains of the b lineage for Aα, and are therefore haplotype
specific.  This RHS represents a site for homologous
recombination not previously identified in laboratory strains
of Mus m. domesticus.
Identification of Recombinationally Depressed
Segments (REDS) Within the I Region


By lining up the haplotypes in this panel as done for
the identification of RHSs, a pattern of associations
between loci emerged.  It was observed that recombination is
localized almost exclusively to the three recombinational
hotspots with one event or fewer occurring elsewhere within
the I region.  This restricted pattern of recombination at
specific, non-random sites separates the I region into
discrete genomic segments flanked by the RHSs.  There is
strong linkage disequilibrium seen between the loci within
any of these three genomic segments.  Depending on the RHSs
active within a particular haplotype, the extent of this
linkage disequilibrium varies (Figure 4-5).  Some haplotypes
carry linkage groups extending across the entire I region,
for example H-2p, while others, with more active forms of
the different RHSs, show shorter linkage groups.  For
example, in the H-2''*^^Q^ haplotype, Aβ and Aα form one
linkage group while II and 5'Eβ form another short linkage
group.  This pattern of recombination within the I region,
defined by RHSs, led these genomic segments flanked by RHSs
to be defined as recombinationally depressed segments
(REDS).  These REDS are characterized by the lack of
recombination between the loci within a particular REDS and,
therefore, a strong linkage disequilibrium is exhibited
between these loci.  Recombinationally depressed segment 1
contains 44 kb of DNA including the genes for Aβ and Aα,
REDS2 contains 19 kb extending from just telomeric of Aα to
the RHS in Eβ, REDS3 spans 21 kb containing 3'Eβ, Eβ2, and
9 kb telomeric to the RHS at Eα, and REDS4 contains Eα and
approximately 15 kb telomeric of this gene.  These REDS are
the genetic units exchanged between haplotypes by homologous
recombination during the evolution of the genus Mus which
led to the diversification of the I region.

Correlation  Analysis  of  REDS 

To  test  how  strong  the  linkage  disequilibrium  is 
between  loci  within  a  REDS  and  across  the  I  region,  a 
statistical  pairwise  correlation  analysis  of  the  F  values 
for  each  pair  of  neighboring  loci  was  performed.   The 
correlation  coefficient  or  R  values  were  calculated.   These 
show  the  relationship  between  the  degree  of  divergence  (F 
value)  for  the  neighboring  loci.   For  example,  by  assessing 
the  degree  of  divergence  of  one  locus,  A,  in  a  particular 
haplotype  as  compared  to  a  certain  subset  of  haplotypes,  if 
there  is  correlation  between  this  locus  and  its  neighbor,  B, 
then  an  analogous  pattern  of  divergence  or  F  value  will  be 
seen  for  locus  B  as  compared  to  the  same  subset  of 
haplotypes (Figure 4-6).  As the strength of the correlation
between loci increases, the R value approaches 1.0.  Table
4-17 shows the cumulative results of this type of analysis
for  all  the  loci  within  the  I  region.   The  center  diagonal 
Table 4-17.  Correlation analysis of loci across the I region.


locus  Aβ
3'Eβ


Eβ2
A^
3'Eβ
0.52
.0001


0.18
0.09

0.02

-0.11

.0001
.7009
0.18
0.03

-0.04
.0001

.4449
.0059

-

0.46
-0.18

-0.21

.0001

.0170

.0001

.0001

-

0.13

-0.11

-0.19

.0006

.0036

.0001

0.51
.0001

-0.04
.3557

0.25
.0001

a The correlation coefficient, R.

b The probability of R under the H0: Rho = 0.

line  of  values  is  for  adjacent  loci.   These  results  are 
represented  graphically  in  Figure  4-7.   Loci  within  a  REDS 
show a significantly high correlation, on average 0.50,
whereas  loci  flanking  RHSs,  therefore  not  within  the  same 
REDS,  show  very  low  correlation  of  around  0.15.   These  data 
are  of  twofold  importance:   (1)  they  confirm  the  location  of 
the  RHSs,  and  (2)  they  give  insight  into  the  relationships 
between  the  loci  contained  within  a  particular  REDS.   These 
relationships  are  twofold:   (1)  the  genes  within  a  REDS  show 
a  concerted  evolution,  i.e.  they  accumulate  mutations  in  a 
coordinate  manner,  and  (2)  the  genomic  sequences  that 
constitute  a  REDS  tend  to  diverge  as  a  single  genetic  unit. 
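
The pairwise correlation described in this section can be illustrated with a short calculation. The text does not name the statistic beyond "correlation coefficient, R"; the sketch below assumes the ordinary Pearson product-moment coefficient, and the F values in the example are invented for illustration rather than taken from the data of this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical F values of one haplotype against the same four reference
# haplotypes, scored at a locus A and at its neighbor B.
f_locus_a = [0.90, 0.80, 0.40, 0.30]
f_linked_b = [0.85, 0.80, 0.35, 0.30]    # diverges together with A
f_unlinked_b = [0.40, 0.90, 0.85, 0.35]  # no shared pattern with A

r_linked = pearson_r(f_locus_a, f_linked_b)
r_unlinked = pearson_r(f_locus_a, f_unlinked_b)
```

With these made-up values the two loci that diverge together give an R close to 1.0, while the pair with no shared pattern gives an R close to zero, which mirrors the contrast drawn between loci within a REDS and loci separated by an RHS.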

Lineage  Analysis  of  REDS 

Because  of  the  coordinate  evolution  of  the  genes  within 
a  REDS  leading  to  the  divergence  of  these  genomic  segments 
as  a  single  genetic  unit,  they  were  analyzed  as  if  they  were 
a  single  gene.   F  values  were  calculated  for  each  of  the 
three  REDS  for  all  the  haplotypes  examined.   These  values 
were  used  to  examine  the  lineage  relationship  between  REDS 
much  the  same  way  as  for  the  individual  loci.   Table  4-18 
shows the results of this analysis.  For REDS1, twenty-three
of the twenty-eight haplotypes fall into three major
evolutionary lineages which leaves only five unique forms of
REDS1.  Haplotypes within a particular REDS lineage show an
average F value of 0.80 while haplotypes among separate
Table 4-18.  I region diversity as defined by REDS.

REDS    lineages            F values               p value
                     within         between

1       11           0.80 ± 0.02    0.39 ± 0.01    << 0.001
2       10           0.83 ± 0.04    0.36 ± 0.02    << 0.001
3        4           0.80 ± 0.01    0.69 ± 0.01     < 0.001
lineages have F values less than 0.39.  Recombinationally
depressed segment 2 shows an analogous pattern where
twenty-six of the strains are divided among four major
evolutionary lineages with an F value of 0.83 within a
particular lineage, and 0.36 between lineages.
Recombinationally depressed segment 3, in the conserved
tract of the I region, shows a lower number of more related
forms analogous to that seen for the individual loci within
this tract.  There are only two forms of REDS3 which share
69% of their restriction fragments.

Haplotype Characterization of t Forms of Chromosome 17

The  same  analyses  done  in  the  wild  type  mice  were  also 
carried  out  on  t-bearing  mice.   There  are  only  seven  t- 
bearing  mouse  strains  in  this  analysis  as  opposed  to  the 
twenty-eight  strains  of  wild  type  mice  examined  above  (refer 
to  Table  3-1) .   The  results  of  the  allele  and  F  value 
analyses  are  shown  in  Figure  4-8.   Although  the  allele  and 
cluster  number  are  lower  for  the  t  haplotypes,  they 
correlate  well  with  what  is  seen  for  the  wild  type  mice. 
The  F  values  are  very  similar,  if  not  identical,  to  those 
calculated  in  the  wild  type  mice.   The  I  region  of  the  t 
haplotypes  is  also  divided  into  conserved  and  polymorphic 
tracts  as  seen  with  the  wild  type  strains. 

There  are  a  large  number  of  wild  type  alleles  seen  in 
the  t  haplotypes.   A  detailed  analysis  of  the  RFLP  alleles 
Table 4-19.  RFLP sizes and allelic grouping of t haplotypes for Aα.

strain    Bam  HI    Bgl  II    Eco  RI    Pvu  II    Sac  I    identical  allele 


a 

t6       5.2       4.8       12.0      7.0       8.0       d,  tw12,  tw71

4.5 

tw5       5.4       6.5       12.0      7.5       13.7      * 

2.0       6.6 

tw8       5.4       4.8       12.0      4.0       13.7      e,  tw32 

6.6 

tw75      5.4       5.2       12.0      7.0       8.0       a 

6.0 

a Values expressed in kilobases.   * Unique to t haplotypes.

Table  4-20.  RFLP  sizes  and  allelic  grouping  of  t  haplotypes  for  II. 

strain    Bam  HI    Bgl  II    Eco  RI    Pvu  II    Sac  I    identical  allele 

a 
t6        10.0      6.0       8.5       2.0       8.0       *,  tw12,  tw71


a 

10.0 

6.0 

8.5 

1.0 

2.0 

10.0 

5.0 

9.0 

2.0 

2.0 

tw5      10.0      5.0      9.0      2.0      9.0      a,  tw8,  tw32, 

tw75 

a Values expressed in kilobases.   * Unique to t haplotypes.

Table 4-21.  RFLP sizes and allelic grouping of t haplotypes for 5'Eβ.

strain    Bam HI    Bgl II    Eco RI    Pvu II    Sac I     identical allele

t6        3.8       3.0       1.0       3.4       2.5       f

tw5       8.6       3.0       2.0       3.4       5.2       a,  tw8,  tw32

tw12      3.8       3.0       1.0       3.6       2.5       *,  tw71

tw75      7.5       3.0       2.0       3.0       4.2       *

a Values expressed in kilobases.   * Unique to t haplotypes.
Table 4-22.  RFLP sizes and allelic grouping of t haplotypes for 3'Eβ.

strain    Bam HI    Bgl II    Eco RI    Pvu II    Sac I     identical allele

t6        10.0      2.8       12.2      4.5       2.6       e,  tw8,  tw32
                                        3.0       2.4

tw5       10.0      3.1       12.2      4.5       2.6       l,  tw75
                                        3.0       2.4

tw12      10.0      4.6       12.0      4.5       2.6       *,  tw71
                                        3.0       2.4

a Values expressed in kilobases.   * Unique to t haplotypes.

Table 4-23.  RFLP sizes and allelic grouping of t haplotypes for Eβ2.

strain    Bam HI    Bgl II    Eco RI    Pvu II    Sac I     identical allele

t6        4.5       4.0       3.0       4.0       10.0      a,  tw12,  tw71
                    1.0       2.5       3.0       1.0

tw5       4.5       4.5       3.0       4.0       10.0      *,  tw8,  tw32,
                    1.0       2.5       3.0       1.0       tw75

0.5

a Values expressed in kilobases.   * Unique to the t haplotypes.

Table  4-24.  RFLP  sizes  and  allelic  grouping  of  t  haplotypes  for  Ea. 

strain    Bam  HI    Bgl  II    Eco  RI    Pvu  II    Sac  I     identical  allele 

a 

t6        6.5       6.0       8.5       7.5       6.0       *,  bv1,  tw12,

1.2       tw71 
0.7 
0.4 

tw5       9.0       5.5       8.0       7.0       8.0       *,  dvl 

2.5 

tw8       6.0       5.5       8.0       7.0       8.0       d,  tw32,  tw75 

2.5 

a Values expressed in kilobases.   * Unique to the t haplotypes.
for the t haplotypes is presented in Tables 4-19 through
4-24.  In fact, there are only six alleles that are unique
to the t haplotypes when all the loci are examined.  Variants
of wild type alleles that are unique to the t haplotype mice
are seen, but the majority of alleles are identical to those
of the wild type haplotypes.

Another observation is that all the RHSs seen in the
wild type haplotypes are active in the diversification of the
I region in t-bearing mice as well.  If RHS activity is
haplotype dependent, then only a subset of those seen in the
wild type is expected to be seen in the t haplotypes.  The
presence of these RHSs and the influences they have on t
haplotype associated I regions can be seen in Figure 4-9.
Recombination at the Aα-II site can be seen in tw8, tw32,
and tw5.  The Eβ RHS appears to be active in tw8, tw32, t6,
and tw5.  Recombination in the Eα RHS appears to occur in tw8
and tw12.  Although not as apparent at first, the division
of the I region into distinct REDS can be seen.  By looking
at the fill pattern associations within the expected REDS,
one can see that stable associations exist between the loci.
The t haplotypes show exactly the same organization and
influences acting during haplotype generation as seen in
wild type mice.
CHAPTER V
DISCUSSION


Polymorphism  of  Genes  Within  the  I  Region 

Mutations in DNA structure can be of three different
forms: point mutations, insertions, and deletions.  RFLP
analyses are capable of detecting all three types of
mutations, although they are most sensitive in detecting
insertions and deletions.  Point mutations or single base
changes are detected only when a mutation of this type
alters the recognition sequence of one of the restriction
endonucleases used in the analysis.  Therefore, RFLP
analysis underestimates the extent of point mutations
relative to insertions or deletions, which will be detected
regardless of the enzyme used as long as the restriction
sites flank the mutation.

Within the I region, three of the seven probes identify
regions that contain insertions or deletions.  The Aβ gene
contains a retroposon insertion in the second intron of H-2"
or evolutionary group 2 alleles (McConnell et al. 1988) and
an additional insertion adjacent to this retroposon in the
H-2^ or group 3 alleles.  The Eβ gene contains three
different insertional/deletional events.  The H-2^ alleles
have an extra 50-100 bp in the second intron adjacent to
the β2 exon while the H-2^ alleles have an additional 200 bp
in the first intron.  Alleles related to the H-2p haplotype
have a 1.0 kb deletion in the second intron relative to the
other allelic lineages.  The third probe which detects a
deletion mutation is for the Eα gene and shows the well
documented 650 bp deletion in the 5' portion of the gene
encompassing the promoter region.  Therefore, of the
seventy-six total alleles for all the loci probed, only six
alleles can be distinguished solely on the basis of a single
gross mutational event based on the five restriction enzymes
in combination with the probes used in this study.

These results suggest that the most predominant
mechanism for generation of diversity at a locus is a single
base change, although this type of analysis would
underestimate the extent of point mutations.  This is not to
say that this is the only mechanism active in the
diversification of the class II genes and neighboring DNA.
Insertions by retroposon elements, such as in Aβ, have been
well documented, and smaller insertions or deletions, due
perhaps to unequal crossing over, have also been seen, such
as in Eβ, which is known to contain a hotspot for
recombination.

Because RFLP analyses survey mostly non-coding
sequences such as introns and intergenic stretches, they may
not give an accurate measure of what is occurring in the
coding sequences.  There is strong evidence for intragenic
recombination or gene conversion in the diversification of
class II genes (McConnell et al. 1988; Mengle-Gaw et al.
1984) which may account for the clustering of mutations in
the exons when their sequences are examined.  Once more
alleles are sequenced for the class II genes, the true
magnitude of the role of intragenic recombination may be
elucidated.

Unequal crossing over has been proposed as an
additional mechanism for the generation of diversity in the
I region.  Recombination within the I region appears to be
of rather high fidelity, judging from the low number of small
insertional or deletional events seen.  Therefore, unequal
crossing over plays a minimal role in generating diversity,
based on both the low incidence of such events around RHSs
and their low incidence in exon sequences.

Grouping the alleles into evolutionary lineages based
on gross mutational differences helps demonstrate that most
polymorphisms within the I region are old events with only
minor drift, as seen by the low number of lineages at each
locus within the polymorphic tract.  This is also supported
by examining the loci in the conserved tract, Eβ2 and Eα,
where an extremely limited number of evolutionary forms are
seen which show a low degree of diversity in comparison to
loci within the variable tract.

It has been previously shown that, for Aβ, only a
limited number of forms of this gene survived speciation
(McConnell et al. 1988) and this, not surprisingly, appears
to be the case for the entire I region.  These evolutionary
or progenitor haplotypes then underwent mutation, both point
mutation and homologous equal recombination, to generate the
haplotypes prevalent in wild populations.

These  same  polymorphisms  are  observed  for  the  t 
haplotypes  in  which  there  are  a  very  large  number  of  wild 
type  alleles  within  these  strains.   If  recombination  is 
suppressed  between  the  wild  type  and  t  forms  of  chromosome 
17,  the  expected  results  would  be  to  see  RFLP  alleles  which 
are  variants  of  the  wild  type  but  unique  only  to  the  t 
haplotype  mice.   The  results  obtained  suggest  that  either 
the  t  forms  of  chromosome  17  are  very  recent  mutations  from 
the  wild  type,  or,  more  probably,  that  there  is  much  more 
recombination  occurring  between  the  wild  type  and  the  t 
haplotypes  in  the  distal  region  than  previously  expected. 

Recombination  Within  the  I  Region 

Once markers for the different lineages of the loci
within the I region were identified, the relationships
between these loci could be examined.  There is, in essence,
one of two expected relationships between neighboring loci:
linkage disequilibrium or recombination.  If recombination
were not occurring in the I region, the expected outcome
would be strong associations between loci across the entire
I region.  If recombination is occurring and random, then
when all haplotypes are examined together, there will be no
linkage seen between any of the loci.  These predictions
represent the extremes, and previous reports based on a
limited number of recombinant inbred strains have shown that
neither appears to be the case (Steinmetz et al. 1982b, 1986,
1987; Lafuse et al. 1986).  These studies were, however,
limited in their scope.  The data presented here are based on
a broad, random sampling of wild derived haplotypes and are
therefore more representative of wild mouse populations.

The results of this investigation reveal that
recombination within the I region is not random and that it
appears to be stringently regulated and restricted to
specific sites.  Of the sixteen recombinational events
scored, six occur in Eβ, four in Eα, and four between Aα and
Eβ, with two occurring elsewhere in the I region.
Recombination rates are not high within the I region, being
at 1 per 1000, given that the genetic map distance of the I
region is 0.1 cM.  Studies on the rate or frequency of
recombination in the I region, based on screening for
recombinant offspring of laboratory inbred mice, have shown
that the numbers of recombinant offspring coincide with the
predicted frequencies based on the genetic map distance
(Steinmetz et al. 1987).  The striking feature of
recombination within the I region is its localization to
recombinational hotspots, which are not randomly arranged
within this genomic segment.
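
The two figures quoted in this paragraph follow from simple arithmetic, sketched below as a check. The event counts are those given in the text; the conversion of map distance to recombinant frequency assumes the usual rule that 1 cM corresponds to 1% recombinants over short distances, and the function and dictionary names are hypothetical.

```python
from fractions import Fraction

# Recombinational events as tallied in the text (sixteen in total).
EVENTS = {
    "E-beta RHS": 6,
    "E-alpha RHS": 4,
    "A-alpha/E-beta RHS": 4,
    "elsewhere": 2,
}


def hotspot_fraction(events, outside_key="elsewhere"):
    """Share of all scored events that fall within a recombinational hotspot."""
    total = sum(events.values())
    return Fraction(total - events[outside_key], total)


def recombinant_frequency(map_distance_cm):
    """Assumed short-distance rule: 1 cM corresponds to 1% recombinants.
    The distance is passed as a string so the fraction stays exact."""
    return Fraction(map_distance_cm) / 100


in_hotspots = hotspot_fraction(EVENTS)        # 14 of 16 events
per_meiosis = recombinant_frequency("0.1")    # 1 recombinant per 1000
```

On these counts fourteen of the sixteen events (7/8) fall within the three hotspots, and a map distance of 0.1 cM corresponds to the 1 per 1000 rate quoted above.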

The other important feature of recombination within
the I region is the haplotype specificity of the RHSs.  The
data presented in this dissertation show that the activity
of the RHSs identified in the wild derived haplotypes in
this panel is strongly haplotype associated.  All
recombination events in the Eβ RHS are in haplotypes sharing
a 5' portion related to the H-2^ allele.  All recombination
events at the Aα to Eβ RHS have an II RFLP allele related to
the H-2" haplotype.  Haplotype specificity for recombination
in the Eα RHS is harder to determine due to the extreme
relatedness of the alleles in this segment of the I region.

Whether  or  not  this  pattern  of  recombination  is  unique 
to  the  I  region  or  is  ubiquitous  throughout  the  genome  is 
difficult  to  assess  due  to  the  low  degree  or  the  lack  of
polymorphism  at  most  other  loci.   The  only  other  system
in  which  recombination  has  been  studied  in  a  mammalian  species
is  the  human  β-globin  locus.   Recombination  at
this  locus  shows  a  pattern  analogous  to
that  in  the  H-2,  i.e.,  it  is  localized  to  a  single  site  or
RHS  (Orkin  and  Kazazian  1984).   When  more  polymorphic
markers  are  identified  for  other  loci  in  the  genome,  it  will
be  of  interest  to  see  whether  this  is  a  unique  phenomenon  or  is
widespread  throughout  the  genome.

Within  the  t  haplotypes,  the  same  RHSs  were  identified. 
The  expected  result  was  a  subset  of
these  RHSs,  on  the  premise  that  the  activity  of  the
founder  haplotype  RHSs  dictates  what  is  seen  in  the
subsequent  descendants  of  that  strain.   Because  many  of  the
same  RFLP  alleles  are  detected  in  the  t  haplotypes  as  in
the  wild  type,  there  must  be  a  higher  rate  of
recombination  between  the  two  forms  of  chromosome  17  than
previously  expected.   Similarly,  only  a  subset  of  the  RHSs
active  in  the  diversification  of  the  I  region  in  the  wild 
type  would  be  expected  to  be  seen  in  the  t-bearing  mice. 
The  data  presented  here  show  the  presence  of  all  the  active 
RHSs  seen  in  the  wild  type  and,  therefore,  further  support 
the  idea  that  recombination  rates  are  higher  between  t  and 
the  wild  type  than  expected. 


The  Influence  of  RHSs  on  the  Evolution  of  I  Region 
Haplotypes  at  the  Genomic  Level 


As  alluded  to  in  the  preceding  section,  lack  of
recombination  can  lead  to  linkage  disequilibrium  or  vice
versa.   What,  then,  is  the  relationship  between
two  loci  within  a  REDS?   As  shown  in  the  last  chapter,  loci
within  a  REDS  show  a  strong  linkage  disequilibrium.   These 
loci  remain  linked  regardless  of  their  haplotype. 
Recombination  occurs  at  the  RHSs  leading  to  the  shuffling  of 
these  linked  loci  between  haplotypes. 
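
For  readers  unfamiliar  with  the  measure,  linkage  disequilibrium
between  two  loci  can  be  quantified  as  D  =  p(AB)  -  p(A)p(B),  with
r^2  =  D^2  /  [p(A)(1  -  p(A))p(B)(1  -  p(B))]  as  a  normalized  form.
The  sketch  below  uses  invented  haplotype  counts  purely  for
illustration.   They  are  not  data  from  this  study,  and  the  allele
labels  and  function  name  are  hypothetical.

```python
# Illustrative sketch only: the haplotype lists below are invented toy data,
# not observations from this study.


def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    `haplotypes` is a list of (allele_at_locus_1, allele_at_locus_2) pairs.
    D is measured for the allele pair carried by the first haplotype.
    """
    n = len(haplotypes)
    a, b = haplotypes[0]
    p_a = sum(1 for h in haplotypes if h[0] == a) / n
    p_b = sum(1 for h in haplotypes if h[1] == b) / n
    p_ab = sum(1 for h in haplotypes if h == (a, b)) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


# Two loci that are never separated by recombination (as within a REDS).
linked = [("A1", "B1")] * 5 + [("A2", "B2")] * 5

# Two loci whose alleles occur in all combinations at equal frequency.
shuffled = (
    [("A1", "B1")] * 2 + [("A1", "B2")] * 2
    + [("A2", "B1")] * 2 + [("A2", "B2")] * 2
)

d_linked, r2_linked = linkage_disequilibrium(linked)
d_shuffled, r2_shuffled = linkage_disequilibrium(shuffled)
print(d_linked, r2_linked)
print(d_shuffled, r2_shuffled)
```

In  this  toy  case  the  completely  linked  pair  gives  D  =  0.25  and
r^2  =  1,  whereas  the  freely  shuffled  pair  gives  D  =  0  and
r^2  =  0.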

These  REDS  represent  discrete  genetic  units  in  which 
the  genes  have  evolved  in  a  coordinate  manner.   This 
suggests  that  during  speciation,  not  just  alleles  of  single 
genes  survived,  but  REDS,  represented  by  the  selectable 
genes  within  them,  were  the  units  to  survive  speciation  and 
any  subsequent  population  bottlenecks. 

With  the  pattern  of  evolution  of  I  region  haplotypes 
described,  the  question  arises  as  to  what  forces  dictate  the 
placement  or  activity  of  a  RHS  in  different  haplotypes.   The
site  of  a  RHS  is  not  random  and  the  distance  between  any  two
RHSs  can  vary  drastically.   This  is  seen  in  the  size
differences  between  REDS  where,  for  example,  REDS1
encompasses  44  kb,  REDS2  contains  19  kb,  and  REDS3  spans  30
kb.   This  shows  that  there  is  not  a  size  constraint  on  the
placement  of  RHSs  within  the  I  region,  but  that  some  other 
factor  dictates  whether  recombination  is  allowed  at  a 
particular  site. 


Recombination,  Selection,  and  Generation
of  I  Region  Haplotypes 


The  real  significance  of  this  pattern  of  recombination 
within  the  I  region  can  be  appreciated  when  the  functional
role  of  the  class  II  molecules  is  taken  into  account.   Cell
surface  expression  of  these  molecules  depends  on  the
association  of  the  products  of  two  genes:  the  α  and  β  genes.
When  selection  operates  on  two  genes,  the  prediction  is  that
there  will  be  very  low  frequencies  of  recombination  between
these  genes.   This  is  exactly  what  is  seen  within  REDS1  for
the  Aβ  and  Aα  genes.   The  proper  expression  of  the  A
molecule  is  dependent  on  the  successful  association  of  the
two  chains  in  the  cytosol.   Selection  against
recombination  between  these  two  adjacent  genes  produces
linkage  disequilibrium  and  subsequent  co-adaptation
of  the  two  genes,  which  ensures  that  the  animal  will
always  be  able  to  assemble  and  express  an  A  molecule  on  its
cell  surface.   This  must  be  of  extreme  importance  to 
the  survival  of  the  animal  as  reflected  by  the  observation 
that  there  are  no  examples  of  mice  lacking  an  A  molecule. 
Experimental  evidence  for  the  importance  of  the  co-adaptation
of  these  two  genes  has  come  from  the  laboratory
of  Germain  et  al.  (1985),  where  it  was  shown  that
haplotype  matched  AβAα  pairs  are  necessary  for  the
expression  of  an  A  molecule  on  the  cell  surface  in
transfection  assays  (Figure  5-1).   Haplotype  matched  pairs
represent  an  α  and  β  chain  from  the  same  REDS1  lineage,
whereas  haplotype  mismatched  pairs  represent  the  product  of
a  recombinational  event  within  this  REDS,  an  event  which  is
rarely  seen.

The  expression  of  the  E  molecule  is  under  a  different 
type  of  control  which  may  reflect  the  lesser  importance  of 
the  E  molecule  expression  to  the  fitness  and  survival  of  the 
animal.   It  has  been  well  documented  that  a  relatively  high
proportion  of  wild  mice  do  not  express  an  E  molecule  and,
therefore,  the  necessity  of  the  E  molecule  for  survival  is
in  question.   Perhaps  at  some  point  earlier  in  the  evolution
of  the  mouse,  the  E  molecule  played  an  integral  role  in  its
survival,  but  this  dependence  on  an  E  molecule  apparently
was  lost  at  some  point  prior  to  the  divergence  of  M.  m.
domesticus  as  a  separate  subspecies.   This  is  not  to  say
that  the  Eβ  and  Eα  genes  are  not  co-adapted  to  some  extent,
but  that  the  mechanism  leading  to  the  expression  of  the  E
molecule  differs  significantly  from  that  seen  for  the  A
molecule. 

Whereas  the  α  and  β  chains  of  the  A  molecule  are
encoded  by  genes  within  the  same  REDS,  the  E  molecule  α  and
β  chains  are  encoded  by  genes  in  two  separate  REDS  and  are
separated  by  two  RHSs.   Recombination  can,  therefore,
shuffle  different  alleles  of  the  α  and  β  genes  between
haplotypes,  leading  to  a  situation  that,  were  it  the  A  molecule,
would  lead  to  the  loss  of  expression.   The  majority  of
wild  mice  express  an  E  molecule,  which  suggests  that  a
mechanism  has  evolved  by  which  the  E
molecule  can  still  be  expressed  regardless  of  recombination
separating  the  two  genes.

The  Eα  gene  shows  only  two  lineages,  the  expressed  form
and  the  deleted  form.   This  monomorphic  Eα  chain  has  evolved
so  that  it  can  associate  with  virtually  any  form  of  the  Eβ
chain.   There  also  appears  to  be  a  very  high  frequency  of  a
single  allele  of  Eβ  in  this  panel  of  mice,  i.e.,  thirteen  of
the  twenty-eight  strains  carry  an  H-2^-like  allele.   This
suggests  that  the  maintenance  of  a  single  Eα  morph  which  can
associate  well  with  one  form  of  Eβ  and  to  an  adequate  degree
with  all  other  forms  of  Eβ  is  the  mechanism  that  has  evolved
to  compensate  for  the  inclusion  of  the  α  and  β  chain  genes
of  this  class  II  molecule  in  separate  REDS  (Figure  5-2).

There  are,  therefore,  two  mechanisms  which  have  evolved 
to  ensure  the  expression  of  a  class  II  molecule  on  the 
surface  of  antigen  presenting  cells.   The  driving  force 
behind  the  evolution  of  these  two  mechanisms  appears  to  be
selection  either  at  the  level  of  chain  association  or  at  the 
phenotypic  level  dictated  by  binding  of  foreign  antigen  for 
recognition  by  T  lymphocytes.   It  is  still  not  clear  whether 
there  is  a  positive  selection  for  recombination  at  these 
RHSs,  or  a  negative  selection  against  recombination  at  sites 
which  fall  within  REDS,  or  a  combination  of  both.   The  most 
probable  explanation  is  a  combination  of  both  where  the 
placement  of  a  RHS  directs  recombination  to  that  site  and 
thereby  suppresses  recombination  for  a  distance  flanking  on 
either  side.   This  paradox  can  be  addressed,  to  some  extent, 
by  looking  at  the  evolution  of  the  RHSs  within  a  panel  of 
related  species  and  subspecies  of  the  genus  Mus.   This  may 
shed  some  light  on  the  question  of  whether  recombination  has
always  been  confined  to  these  RHSs  or  whether,  when  viewed  over  an
evolutionary  course  of  8  million  years,  recombination  will
appear  to  be  random.